This guide is based on this tutorial. Check it out if you want a more in-depth explanation.
To start, you will need a tab- or comma-delimited file of the activities and dates that you want to use to make the timeline. For our very simple graph we have four columns: the item number, the activity, the start date, and the end date. You will also need the R package tidyverse.
The first thing we need to do is load the file into R.
library(tidyverse)
gantt <- read.csv("gantt_ROSF.csv", header = TRUE)
head(gantt)
##   Item                              Activity      Start        End
## 1    1                    Aiptasia culturing 2020.06.01 2020.09.30
## 2    2                     Sample collection 2020.08.01 2020.09.30
## 3    3 Extractions of RNA, DNA , metabolites 2020.10.01 2020.12.31
## 4    4     16S microbiome library sequencing 2021.01.01 2021.02.28
## 5    5        Metagenomic library sequencing 2021.01.01 2021.03.31
## 6    6 Metatranscriptomic library sequencing 2021.04.01 2021.06.30
Now we need to create a vector of activities. We can do this by extracting the Activity column. Because ggplot draws a discrete y-axis from the bottom up, with the first factor level at the bottom, we need to reverse the order so that the first activity ends up at the top of the chart.
acts <- rev(gantt[[2]])
acts
## [1] Preparation of manuscript(s) Multi-omics data analyis
## [3] Metabolomics assays Metatranscriptomic library sequencing
## [5] Metagenomic library sequencing 16S microbiome library sequencing
## [7] Extractions of RNA, DNA , metabolites Sample collection
## [9] Aiptasia culturing
## 9 Levels: 16S microbiome library sequencing ... Sample collection
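To see why the reversal matters, here is a minimal sketch with three made-up activity names (they are not from the original data). It only inspects factor levels. The comments about where each level lands describe how ggplot2 lays out a discrete y-axis.

```r
# Toy activity names, invented for illustration only
toy_acts <- c("First task", "Second task", "Third task")

# Levels in file order: "First task" is the first level,
# so ggplot2 would draw it at the bottom of the y-axis
in_order <- factor(toy_acts, levels = toy_acts)
levels(in_order)

# Levels reversed: "First task" is now the last level,
# so ggplot2 would draw it at the top of the y-axis
reversed <- factor(toy_acts, levels = rev(toy_acts))
levels(reversed)
```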
Our activities are looking good. Now we need to reshape the data so that all the dates sit in a single column. We can do this by labeling each date as a start or end date. Luckily there is a tidyr function, gather, that collapses columns into key-value pairs. At the same time we can turn Activity into a factor with defined levels so that the chart comes out in the order we want.
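# Collapse the Start and End columns (3 and 4) into a "state"/"date" pair,
# parse the dates, and fix the Activity order using the reversed vector
g.gantt <- gather(gantt, "state", "date", 3:4) %>%
  mutate(date = as.Date(date, "%Y.%m.%d"),
         Activity = factor(Activity, levels = as.character(acts)))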
## Warning: attributes are not identical across measure variables;
## they will be dropped
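gather still works, but the tidyr documentation marks it as superseded by pivot_longer. The sketch below shows what the same reshaping might look like with pivot_longer, assuming tidyr 1.0.0 or later. It uses a two-row toy data frame copied from the first two rows shown earlier, and the object names toy and toy_long are made up for this example.

```r
library(tidyr)
library(dplyr)

# Two rows copied from the head() output above, kept as plain strings
toy <- data.frame(
  Item = 1:2,
  Activity = c("Aiptasia culturing", "Sample collection"),
  Start = c("2020.06.01", "2020.08.01"),
  End = c("2020.09.30", "2020.09.30"),
  stringsAsFactors = FALSE
)

# Stack Start and End into one "date" column, labeled by "state",
# then parse the dates and set the Activity levels in reversed order
toy_long <- toy %>%
  pivot_longer(cols = c(Start, End), names_to = "state", values_to = "date") %>%
  mutate(date = as.Date(date, format = "%Y.%m.%d"),
         Activity = factor(Activity, levels = rev(toy$Activity)))

toy_long
```

The rows come out in a different order than with gather (each item's Start and End are adjacent), which makes no difference to the plot.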
To create evenly spaced dates on the x-axis we can use the seq.Date function, which returns a vector of dates between the start and end of the project at the specified interval. We can pass this vector to the breaks argument of scale_x_date to get an evenly spaced x-axis.
seq.Date(as.Date("2020-06-01"), as.Date("2022-06-01"),"quarter")
## [1] "2020-06-01" "2020-09-01" "2020-12-01" "2021-03-01" "2021-06-01" "2021-09-01"
## [7] "2021-12-01" "2022-03-01" "2022-06-01"
# scale_x_date(breaks=seq.Date(as.Date("2020-06-01"), as.Date("2022-06-01"), "quarter"), labels=c("6.1.20", "", "12.1.20", "", "6.1.21", "", "12.1.21","","6.1.22"))
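The labels above are typed by hand, so they have to be updated whenever the project dates change. As a rough sketch, the same labels could be derived from the breaks themselves. The names breaks and labels here are just illustrative.

```r
# Quarterly breaks over the project period
breaks <- seq.Date(as.Date("2020-06-01"), as.Date("2022-06-01"), by = "quarter")

# Build month.day.year labels without leading zeros, e.g. "6.1.20"
labels <- paste(as.integer(format(breaks, "%m")),
                as.integer(format(breaks, "%d")),
                format(breaks, "%y"),
                sep = ".")

# Blank out every second label so only half-year marks are shown
labels[seq(2, length(labels), by = 2)] <- ""
labels
```

These two vectors can then be passed as the breaks and labels arguments of scale_x_date.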
Now we are ready to plot the chart.
#pdf(file = "Gantt2.pdf",height=8.5,width=11)
ggplot(g.gantt, aes(date, Activity, color = Activity, group = Item)) +
  # one thick line per item; in ggplot2 >= 3.4.0 use linewidth = 10 instead of size
  geom_line(size = 10) +
  labs(x = "Date", y = NULL, title = "Timeline") +
  scale_x_date(breaks = seq.Date(as.Date("2020-06-01"), as.Date("2022-06-01"), "quarter"),
               labels = c("6.1.20", "", "12.1.20", "", "6.1.21", "", "12.1.21", "", "6.1.22"),
               expand = c(0, 1)) +
  theme_bw() +
  theme(legend.position = "none",
        axis.text.x = element_text(size = 10),
        axis.title.x = element_text(size = 16),
        axis.text.y = element_text(size = 14),
        title = element_text(size = 20)) +
  theme(plot.margin = unit(c(5.1, 5.5, 4.1, 2.1), "mm"))
#dev.off()
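As an alternative to wrapping the plot in pdf() and dev.off(), ggplot2's ggsave function can write a plot object straight to a file. The sketch below uses a tiny made-up data frame and a temporary output path, both invented for illustration. For the real chart you would assign the ggplot call above to an object and pass that instead.

```r
library(ggplot2)

# Toy data, invented for illustration only
toy <- data.frame(
  date = as.Date(c("2020-06-01", "2020-09-30")),
  Activity = c("Example task", "Example task")
)

# Store the plot in an object instead of printing it
p <- ggplot(toy, aes(date, Activity, group = Activity)) +
  geom_line()

# Hypothetical output path; width and height are in inches by default
out_file <- file.path(tempdir(), "gantt_example.pdf")
ggsave(out_file, plot = p, width = 11, height = 8.5)
```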
If something wasn't covered, or if you would like more information about ggplot, this website has a lot of really great information. Give it a look!