EconDatasets.jl
1 Motivation
This package makes various econometric data sets accessible, much as the RDatasets package provides access to the standard data sets available in R.
The problem with some econometric data sets, however, is that although the data is freely available on the web, redistributing it is not permitted. Hence, some of the data sets are not shipped with the package and need to be downloaded into the package's data directory first.
Once a given data set is downloaded, it can be loaded in Julia with a syntax similar to that of RDatasets:

```julia
using EconDatasets
sectors = dataset("Sectors")
```
In order to download a given data set, call the function `getDataset` first. This downloads the data into the data directory of the package, making it accessible for future use with the function `dataset`.

```julia
getDataset("FFF") # Fama French factors
```
In addition, there are functions that load data directly into an interactive Julia session without storing it on disk for future use. For some of these data sets, however, minor manual interaction might be required to label and process the data adequately.
2 Table of data sets
Name | Description |
---|---|
Sectors | Sector affiliations for SP500 components |
SP500 | Stock price data for SP500 components |
UMD | Fama French momentum portfolio |
FFF | Fama French factors |
SP500Ticker | SP500 ticker symbols from Wikipedia |
Indices | Major stock price indices |
Treasuries | US Treasury rates, several maturities |
DieboldLi | Fixed-maturity yields used in Diebold-Li |
Of these data sets, only the following are already included in the repository:
- Sectors
- SP500Ticker
3 Data sets to be downloaded first
The following data sets do not ship with the package, as they may not be redistributed. Hence, they need to be downloaded first, which can be done with the function `getDataset`. Some of these data sets are retrieved by a script rather than a function, so they create variables in your workspace. Also, if running a script through the high-level function `getDataset` causes an error, it might still work if you run the script directly with `include`:

```julia
include(joinpath(Pkg.dir("EconDatasets"), "src/getDataset/", "getSP500.jl"))
```
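The fallback described above can be wrapped in a small helper. This is only a sketch: `with_fallback` is a hypothetical name that is not part of EconDatasets, and the commented usage lines assume the package is installed and that the script path matches the `include` call shown above.

```julia
# Hypothetical helper (not part of EconDatasets): run `primary`; if it throws,
# report the error and run `fallback` instead.
function with_fallback(primary::Function, fallback::Function)
    try
        return primary()
    catch err
        println("Primary download failed: ", err)
        return fallback()
    end
end

# Intended use, assuming EconDatasets is installed and loaded:
# script = joinpath(Pkg.dir("EconDatasets"), "src/getDataset/", "getSP500.jl")
# with_fallback(() -> getDataset("SP500"), () -> include(script))
```

Since the scripts create variables in the workspace, the fallback is best run from the top level of an interactive session rather than from inside a module.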
- Data sets downloaded by functions
  - FFF
  - UMD
  - Indices
  - Treasuries
  - DieboldLi
- Data sets downloaded by scripts
  - SP500 (due to parallel computing)
  - SP500Ticker (due to Gumbo package usage)
4 Table of functions to interactively download data
Name | Description |
---|---|
readFamaFrench | |
readFamaFrenchRaw | see ijulia example |
readYahooFinance | see blog post |
readYahooAdjClose | see blog post |
getAdjClose | |
readUSTreasuries | |
5 Acknowledgement
Of course, any package can only be as good as the individual parts that it builds on. Accordingly, I'd like to thank everyone involved in developing the functionality that this package builds upon. In particular, I want to thank the developers of
- the Julia language, for their continuous and tremendous efforts during the creation of this free, fast and highly flexible programming language!
- the DataFrames package, which definitely provides the best representation for general types of data in data analysis. It's a role model that every last bit of code of `TimeData` depends on, and the interface that every statistics package should use.
- the Datetime package, which is a thoughtful implementation of dates, time and durations, and the backbone of all time components in `TimeData`.
- the TimeSeries package, which follows a different approach to handling time series data. Having a quite similar goal in mind, the package was a great inspiration for me, and occasionally I could even borrow parts of code from it (for example, from an old version of function `readtime`).