Importing and Exporting (I/O)
Importing data from tabular data files
To read data from a CSV-like file, use the readtable function:
#DataFrames.readtable — Function.
Read data from a tabular-file format (CSV, TSV, ...)
readtable(filename; header::Bool = true, separator::Char = getseparator(filename), quotemark::Vector{Char} = ['"'], decimal::Char = '.', nastrings::Vector = ASCIIString["", "NA"], truestrings::Vector = ASCIIString["T", "t", "TRUE", "true"], falsestrings::Vector = ASCIIString["F", "f", "FALSE", "false"], makefactors::Bool = false, nrows::Integer = -1, names::Vector = Symbol[], eltypes::Vector{DataType} = DataType[], allowcomments::Bool = false, commentmark::Char = '#', ignorepadding::Bool = true, skipstart::Integer = 0, skiprows::AbstractVector{Int} = Int[], skipblanks::Bool = true, encoding::Symbol = :utf8, allowescapes::Bool = false, normalizenames::Bool = true)
Arguments
filename: the filename to be read
Keyword Arguments
header::Bool – Use the information from the file's header line to determine column names. Defaults to true.
separator::Char – Assume that fields are split by the separator character. If not specified, it will be guessed from the filename: .csv defaults to ',', .tsv defaults to '\t', .wsv defaults to ' '.
quotemark::Vector{Char} – Treat fields enclosed between two quotemark characters as quoted, which disables processing of separators and linebreaks inside them. Set to Char[] to disable this feature and slightly improve performance. Defaults to ['"'].
decimal::Char – Assume that the decimal place in numbers is written using the decimal character. Defaults to '.'.
nastrings::Vector{ASCIIString} – Translate any of the strings in this vector into an NA. Defaults to ["", "NA"].
truestrings::Vector{ASCIIString} – Translate any of the strings in this vector into a Boolean true. Defaults to ["T", "t", "TRUE", "true"].
falsestrings::Vector{ASCIIString} – Translate any of the strings in this vector into a Boolean false. Defaults to ["F", "f", "FALSE", "false"].
makefactors::Bool – Convert string columns into PooledDataVectors for use as factors. Defaults to false.
nrows::Int – Read only nrows rows from the file. Defaults to -1, which indicates that the entire file should be read.
names::Vector{Symbol} – Use the values in this array as the names for all columns instead of the names in the file's header. Defaults to [], which indicates that the header should be used if present or that numeric names should be invented if there is no header.
eltypes::Vector{DataType} – Specify the types of all columns. Defaults to [].
allowcomments::Bool – Ignore all text inside comments. Defaults to false.
commentmark::Char – Specify the character that starts comments. Defaults to '#'.
ignorepadding::Bool – Ignore all whitespace on the left and right sides of a field. Defaults to true.
skipstart::Int – Specify the number of initial rows to skip. Defaults to 0.
skiprows::Vector{Int} – Specify the indices of lines in the input to ignore. Defaults to [].
skipblanks::Bool – Skip any blank lines in the input. Defaults to true.
encoding::Symbol – Specify the file's encoding as either :utf8 or :latin1. Defaults to :utf8.
normalizenames::Bool – Ensure that column names are valid Julia identifiers. For instance, this renames a column named "a b" to "a_b", which can then be accessed with :a_b instead of symbol("a b"). Defaults to true.
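The filename-based guess used for the separator default (commas for .csv, tabs for .tsv, a single space for .wsv) can be sketched as follows. This is only an illustration: guess_separator is a hypothetical helper, not part of DataFrames, and the comma fallback for unrecognized extensions is an assumption of this sketch rather than documented behavior.

```julia
# Hypothetical helper illustrating the documented extension-to-separator rule.
# It is not the library's getseparator; it only mirrors the documented mapping.
function guess_separator(filename)
    ext = splitext(filename)[2]   # e.g. ".csv"; "" when there is no extension
    if ext == ".csv"
        return ','
    elseif ext == ".tsv"
        return '\t'
    elseif ext == ".wsv"
        return ' '
    else
        # Assumption of this sketch: fall back to a comma for other extensions.
        return ','
    end
end

guess_separator("data.csv")   # ','
guess_separator("data.tsv")   # '\t'
guess_separator("data.wsv")   # ' '
```

For files whose extension does not match one of these cases, passing separator explicitly avoids relying on any guess.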
Result
::DataFrame
Examples
df = readtable("data.csv")
df = readtable("data.tsv")
df = readtable("data.wsv")
df = readtable("data.txt", separator = ' ')
df = readtable("data.txt", header = false)
Exporting data to a tabular data file
To write data to a CSV file, use the writetable function:
#DataFrames.writetable — Function.
Write data to a tabular-file format (CSV, TSV, ...)
writetable(filename::AbstractString, df::AbstractDataFrame; header::Bool = true, separator::Char = getseparator(filename), quotemark::Char = '"', nastring::AbstractString = "NA", append::Bool = false)
Arguments
filename: the filename to be created
df: the AbstractDataFrame to be written
Keyword Arguments
separator::Char – The separator character that you would like to use. Defaults to the output of getseparator(filename), which uses commas for files that end in .csv, tabs for files that end in .tsv and a single space for files that end in .wsv.
quotemark::Char – The character used to delimit string fields. Defaults to '"'.
header::Bool – Whether the file should contain a header that specifies the column names from df. Defaults to true.
nastring::AbstractString – What to write in place of missing data. Defaults to "NA".
Result
::DataFrame
Examples
df = DataFrame(A = 1:10)
writetable("output.csv", df)
writetable("output.dat", df, separator = ',', header = false)
writetable("output.dat", df, quotemark = '\'', separator = ',')
writetable("output.dat", df, header = false)
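As a rough end-to-end check, writetable and readtable can be combined into a round trip. The sketch below assumes a DataFrames release that still provides both functions; the temporary directory and the file name roundtrip.tsv are illustrative choices only.

```julia
using DataFrames

# Assumes a DataFrames version in which readtable and writetable are available.
df = DataFrame(A = 1:3, B = ["x", "y", "z"])

# The .tsv extension makes both functions default to a tab separator.
path = joinpath(mktempdir(), "roundtrip.tsv")

writetable(path, df)      # header = true by default, so column names are written
df2 = readtable(path)     # column names are taken from the header line

# The round trip should preserve the shape of the table.
size(df2) == size(df)
```

Because the separator is inferred from the extension on both sides, no separator keyword is needed here; with an extension such as .dat it would be safer to pass the same separator to both calls.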