Just some explanation as an aside: before you can use pd.read_csv to import your data, you need to locate your data in your filesystem.
Assuming you use a Jupyter notebook or a Python file and the CSV file is in the same directory you are currently working in, you can just use:
import pandas as pd
SouthKoreaRoads_df = pd.read_csv('SouthKoreaRoads.csv')
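If read_csv raises FileNotFoundError, it can help to check which directory Python is actually working in. The following is a minimal sketch using only the standard library; the helper name describe_location is hypothetical and the file name is simply the one from the example above.

```python
from pathlib import Path


def describe_location(filename):
    """Return the absolute path a relative filename resolves to, and whether it exists."""
    path = Path.cwd() / filename
    return path, path.exists()


path, found = describe_location("SouthKoreaRoads.csv")
print("Working directory:", Path.cwd())
print("Looking for:", path)
print("File found:", found)
```

If "File found" is False, the CSV is not in the working directory and a longer path is needed.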
If the file is located in another directory, you need to specify that directory. For example, if the CSV is in a subdirectory (relative to the Python file or Jupyter notebook you are working on), you need to add the directory's name. If it's in a folder named "data", add data in front of the file name, separated by a "/":
import pandas as pd
SouthKoreaRoads_df = pd.read_csv('data/SouthKoreaRoads.csv')
Pandas accepts any valid string path as well as URLs, so you could also give a full path:
import pandas as pd
# The r prefix makes this a raw string, so the backslashes are not treated as escape sequences
SouthKoreaRoads_df = pd.read_csv(r'C:\Users\Ron\Desktop\Clients.csv')
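A note on Windows paths: in an ordinary Python string literal a backslash starts an escape sequence, and a sequence such as \U makes the literal a syntax error. The sketch below shows three common ways to write the same path safely. It only compares the strings and does not touch the filesystem, so the path does not need to exist.

```python
from pathlib import PureWindowsPath

raw = r"C:\Users\Ron\Desktop\Clients.csv"         # raw string: backslashes kept literally
escaped = "C:\\Users\\Ron\\Desktop\\Clients.csv"  # each backslash doubled
forward = "C:/Users/Ron/Desktop/Clients.csv"      # forward slashes instead of backslashes

# The raw and escaped spellings produce the identical string
print(raw == escaped)  # True

# Interpreted as Windows paths, the forward-slash form refers to the same location
print(PureWindowsPath(forward) == PureWindowsPath(raw))  # True
```

Any of the three spellings can be passed to pd.read_csv.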
So far, no os package is needed. read_csv also accepts os.PathLike objects, but the os module is only useful if you want to build a path in a variable before accessing it, or if you do complex path handling, for example because your code needs to run in another environment such as a web app, where the path is relative and could change depending on the deployment.
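As an illustration of the path-like case, here is a small sketch that builds the path with pathlib (a standard-library alternative to os.path, swapped in here for brevity) and passes the resulting object straight to read_csv. The directory layout and the CSV contents are made up for the example and are written to a temporary folder so the snippet runs anywhere.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical layout: <base_dir>/data/SouthKoreaRoads.csv
with tempfile.TemporaryDirectory() as tmp:
    base_dir = Path(tmp)
    csv_path = base_dir / "data" / "SouthKoreaRoads.csv"

    # Create a tiny placeholder file so there is something to read
    csv_path.parent.mkdir(parents=True)
    csv_path.write_text("road,length_km\nA,12.5\nB,7.0\n")

    # A Path object can be passed directly; no conversion to str is needed
    roads_df = pd.read_csv(csv_path)

print(roads_df)
```

In a deployed application, base_dir would typically come from configuration or from Path(__file__).parent rather than from a temporary directory.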
Please see also:
https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
https://docs.python.org/3/library/os.path.html
BR
Answer from fabigr8 on Stack Overflow: python - How to read CSV file using Pandas (Jupyter notebooks)
SouthKoreaRoads = pd.read_csv("./SouthKoreaRoads.csv")
Try this and see whether it could help!
pandas.read_csv to the rescue:
import pandas as pd
df = pd.read_csv("data.csv")
print(df)
This outputs a pandas DataFrame:
         Date    price  factor_1  factor_2
0  2012-06-11  1600.20     1.255     1.548
1  2012-06-12  1610.02     1.258     1.554
2  2012-06-13  1618.07     1.249     1.552
3  2012-06-14  1624.40     1.253     1.556
4  2012-06-15  1626.15     1.258     1.552
5  2012-06-16  1626.15     1.263     1.558
6  2012-06-17  1626.15     1.264     1.572
To read a CSV file as a pandas DataFrame, you'll need to use pd.read_csv, which has sep=',' as the default.
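To try this without a file on disk, the sketch below feeds the first rows of the output above through io.StringIO. It assumes data.csv is plain comma-separated text with a header row, which the original does not show. The parse_dates argument is an optional extra that converts the Date column to datetimes instead of leaving it as text.

```python
import io

import pandas as pd

# Assumed raw contents of data.csv (first three data rows only)
csv_text = """Date,price,factor_1,factor_2
2012-06-11,1600.20,1.255,1.548
2012-06-12,1610.02,1.258,1.554
2012-06-13,1618.07,1.249,1.552
"""

# read_csv accepts file-like objects as well as paths
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

print(df)
print(df.dtypes)
```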
But this isn't where the story ends; data exists in many different formats and is stored in different ways so you will often need to pass additional parameters to read_csv to ensure your data is read in properly.
Here's a table listing common scenarios encountered with CSV files along with the appropriate argument you will need to use. You will usually need all or some combination of the arguments below to read in your data.
┌───────────────────────────────────────────────────────┬───────────────────────┬────────────────────────────────────────────────────┐
│ pandas Implementation                                 │ Argument              │ Description                                        │
├───────────────────────────────────────────────────────┼───────────────────────┼────────────────────────────────────────────────────┤
│ pd.read_csv(..., sep=';')                             │ sep/delimiter         │ Read CSV with different separator¹                 │
│ pd.read_csv(..., delim_whitespace=True)               │ delim_whitespace      │ Read CSV with tab/whitespace separator             │
│ pd.read_csv(..., encoding='latin-1')                  │ encoding              │ Fix UnicodeDecodeError while reading²              │
│ pd.read_csv(..., header=None, names=['x', 'y', 'z'])  │ header and names      │ Read CSV without headers³                          │
│ pd.read_csv(..., index_col=[0])                       │ index_col             │ Specify which column to set as the index⁴          │
│ pd.read_csv(..., usecols=['x', 'y'])                  │ usecols               │ Read subset of columns                             │
│ pd.read_csv(..., thousands='.', decimal=',')          │ thousands and decimal │ Numeric data is in European format (e.g. 1.234,56) │
└───────────────────────────────────────────────────────┴───────────────────────┴────────────────────────────────────────────────────┘
Footnotes
¹ By default, read_csv uses a C parser engine for performance. The C parser can only handle single-character separators. If your CSV has a multi-character separator, you will need to modify your code to use the 'python' engine. You can also pass regular expressions: df = pd.read_csv(..., sep=r'\s*\|\s*', engine='python')
² UnicodeDecodeError occurs when the data was stored in one encoding but read in a different, incompatible one. The most common encodings are 'utf-8' and 'latin-1'; your data is likely to fit one of these.
³ header=None specifies that the first row in the CSV is a data row rather than a header row, and names=[...] allows you to specify a list of column names to assign to the DataFrame when it is created.
⁴ "Unnamed: 0" occurs when a DataFrame with an unnamed index is saved to CSV and then re-read. Instead of fixing the issue while reading, you can also fix it when writing by using df.to_csv(..., index=False)
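Several of these arguments are usually combined. As a rough sketch, the snippet below reads made-up semicolon-separated data that has no header row and uses European number formatting, then reads whitespace-separated data. The sample strings and column names are invented for illustration. Note that recent pandas releases deprecate delim_whitespace in favor of sep=r'\s+', which is what the second call uses.

```python
import io

import pandas as pd

# Made-up sample: semicolon separator, no header row, European-style numbers
raw = "a;1.234,56\nb;7.890,12\n"

df = pd.read_csv(
    io.StringIO(raw),
    sep=";",                    # non-default separator
    header=None,                # first row is data, not column names
    names=["label", "amount"],  # supply the column names ourselves
    thousands=".",              # "." groups thousands
    decimal=",",                # "," marks the decimal point
)
print(df)

# Made-up sample: columns separated by runs of spaces or tabs
ws = pd.read_csv(io.StringIO("x y\n1   2\n3\t4\n"), sep=r"\s+")
print(ws)
```

The amount column should come back as floats (1234.56 and 7890.12) rather than as strings.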
There are other arguments I've not mentioned here, but these are the ones you'll encounter most frequently.
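The "Unnamed: 0" situation is easy to reproduce. This small sketch round-trips a made-up DataFrame through CSV text in memory and shows both fixes: index_col=0 when reading, and index=False when writing.

```python
import io

import pandas as pd

df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

with_index = df.to_csv()                # index is written as an unnamed first column
without_index = df.to_csv(index=False)  # index is not written at all

# Re-reading naively turns the saved index into an "Unnamed: 0" column
naive = pd.read_csv(io.StringIO(with_index))
print(naive.columns.tolist())  # ['Unnamed: 0', 'x', 'y']

# Fix while reading: treat the first column as the index
fixed_on_read = pd.read_csv(io.StringIO(with_index), index_col=0)
print(fixed_on_read.columns.tolist())  # ['x', 'y']

# Fix while writing: nothing extra to handle on the way back in
fixed_on_write = pd.read_csv(io.StringIO(without_index))
print(fixed_on_write.columns.tolist())  # ['x', 'y']
```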