You can also try this:
df = pd.DataFrame(series).transpose()
Using the transpose() function, you can interchange the index and the columns. The output looks like this:
   a  b  c
0  1  2  3
Answer from PJay on Stack Overflow
You don't need the transposition step, just wrap your Series inside a list and pass it to the DataFrame constructor:
pd.DataFrame([series])
   a  b  c
0  1  2  3
Alternatively, call Series.to_frame, then transpose using the shortcut .T:
series.to_frame().T
   a  b  c
0  1  2  3
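To compare the three spellings side by side, here is a minimal sketch. The sample Series is an assumption chosen to match the output shown above; it is not the asker's actual data.

```python
import pandas as pd

# Hypothetical sample data matching the output shown above.
series = pd.Series([1, 2, 3], index=["a", "b", "c"])

via_transpose = pd.DataFrame(series).transpose()  # one column, flipped into one row
via_list = pd.DataFrame([series])                 # a list holding a single row
via_to_frame = series.to_frame().T                # to_frame, then the .T shortcut

for df in (via_transpose, via_list, via_to_frame):
    print(df)
```

All three give a single-row frame whose columns are the Series index labels, so the choice is mostly a matter of taste.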
I don't have your dataframe, but here's an example with small data to show that pandas constructs the dataframe as expected (using pandas 0.15.1 and Python 3.4). As expected, NaNs are introduced when the indices don't match.
The last row of your data is ('intercept', ''), while all the other rows are numbers. So ('intercept', '') goes to the index of each series, and this is probably causing the values in the index to be "promoted" to strings.
>>> s1 = pd.Series([1,2,3], index=pd.MultiIndex.from_tuples([(1,1),(1,2),(1,3)], names=['a','b']))
>>> s1
a  b
1  1    1
   2    2
   3    3
dtype: int64
>>> s2 = pd.Series([100,200,300], index=pd.MultiIndex.from_tuples([(1,2),(1,3),(1,4)], names=['a','b']))
>>> s2
a  b
1  2    100
   3    200
   4    300
dtype: int64
>>> df = pd.DataFrame({'s1':s1, 's2':s2})
>>> df
     s1   s2
a b
1 1   1  NaN
  2   2  100
  3   3  200
  4 NaN  300
>>> df.index.values
array([(1, 1), (1, 2), (1, 3), (1, 4)], dtype=object)
The indexes are being converted to strings because the last index label in your series data,
intercept [-11.4551819018]
is a string. The pandas documentation states that a data frame constructed from a series keeps the series' index, so that one string label in the last line of the data causes the whole index to be converted to strings.
Your solution of creating two data frames and then joining them works because the indexing stays consistent: you are using the same data structure (a data frame) throughout rather than converting from one data structure (a series) to another (a data frame). It seems to be a pandas-specific thing. I would stick with your solution.
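If you want to check what the index actually holds after the conversion, a small sketch like the following may help. The labels and values here are hypothetical stand-ins for the asker's data (a flat index rather than a MultiIndex, to keep it short), and the column name coef is made up for illustration.

```python
import pandas as pd

# Hypothetical data: numeric labels plus one string label, as in the question.
s = pd.Series([0.5, 1.5, -11.4551819018], index=[1, 2, "intercept"])

# Build the frame from the Series; "coef" is an illustrative column name.
df = s.to_frame(name="coef")

# Inspect the index dtype and the Python type of each individual label.
print(df.index.dtype)
print([type(label).__name__ for label in df.index])
```

Printing the per-label types shows directly whether the numeric labels were changed or merely stored alongside the string in a mixed index.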
Here is how to create a DataFrame where each series is a row.
For a single Series (resulting in a single-row DataFrame):
series = pd.Series([1,2], index=['a','b'])
df = pd.DataFrame([series])
For multiple series with identical indices:
cols = ['a','b']
list_of_series = [pd.Series([1,2],index=cols), pd.Series([3,4],index=cols)]
df = pd.DataFrame(list_of_series, columns=cols)
For multiple series with possibly different indices:
list_of_series = [pd.Series([1,2],index=['a','b']), pd.Series([3,4],index=['a','c'])]
df = pd.concat(list_of_series, axis=1).transpose()
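As a quick illustration of the different-indices case, this sketch reuses the two Series from the snippet above and shows where the gaps end up. Nothing here goes beyond the calls already used in this answer.

```python
import pandas as pd

list_of_series = [
    pd.Series([1, 2], index=["a", "b"]),
    pd.Series([3, 4], index=["a", "c"]),
]

# Concatenate as columns (aligning on the union of labels), then flip to rows.
df = pd.concat(list_of_series, axis=1).transpose()
print(df)

# Labels missing from a given Series show up as NaN in that row.
print(df.isna())
```

Because NaN is a float, the resulting columns are float rather than integer.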
To create a DataFrame where each series is a column, see the answers by others. Alternatively, one can create a DataFrame where each series is a row, as above, and then use df.transpose(). However, the latter approach is inefficient if the columns have different data types.
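The dtype issue can be seen in a small sketch. The two Series below, named n and s, are hypothetical examples with different data types.

```python
import pandas as pd

# Hypothetical Series with different data types.
n = pd.Series([1, 2], name="n")      # integers
s = pd.Series(["x", "y"], name="s")  # strings

# Series as columns directly: each column keeps its own dtype.
by_concat = pd.concat([n, s], axis=1)

# Series as rows first, then transposed: the mixed rows force generic columns.
by_transpose = pd.DataFrame([n, s]).transpose()

print(by_concat.dtypes)
print(by_transpose.dtypes)
```

In the transposed frame the integer column comes back as a generic object column, which is what makes the row-then-transpose route less efficient for mixed types.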
No need to initialize an empty DataFrame (you weren't even doing that, you'd need pd.DataFrame() with the parens).
Instead, to create a DataFrame where each series is a column,
- make a list of Series, series, and
- concatenate them horizontally with df = pd.concat(series, axis=1)
Something like:
series = [pd.Series(mat[name][:, 1]) for name in Variables]
df = pd.concat(series, axis=1)
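For a version that runs on its own, here is a sketch with stand-in data. The contents of mat and Variables are assumptions (a dict of small 2-D NumPy arrays and a list of its keys), since the asker's real objects are not shown; passing name= to each Series is an addition so the columns get readable labels.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the asker's `mat` and `Variables`.
mat = {
    "x": np.array([[0, 10], [1, 11], [2, 12]]),
    "y": np.array([[0, 20], [1, 21], [2, 22]]),
}
Variables = ["x", "y"]

# One Series per variable, taken from the second column of each array.
series = [pd.Series(mat[name][:, 1], name=name) for name in Variables]

# Concatenate horizontally: each Series becomes one column.
df = pd.concat(series, axis=1)
print(df)
```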
Rather than create 2 temporary dfs you can just pass these as params within a dict using the DataFrame constructor:
pd.DataFrame({'email':sf.index, 'list':sf.values})
There are lots of ways to construct a df, see the docs
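A self-contained sketch of the dict approach follows. The Series sf here is a hypothetical stand-in with made-up example.com addresses, since the original data is not shown.

```python
import pandas as pd

# Hypothetical stand-in for `sf`: values indexed by email address.
sf = pd.Series(
    ["A", "B"],
    index=pd.Index(["u1@example.com", "u2@example.com"], name="email"),
)

# Build both columns in one constructor call: index labels and values.
df = pd.DataFrame({"email": sf.index, "list": sf.values})
print(df)
```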
to_frame():
Starting with the following Series, email_series:
email
[email protected] A
[email protected] B
[email protected] C
dtype: object
email_series = pd.Series(["[email protected]", ...])
I use to_frame to convert the series to DataFrame:
df = email_series.to_frame().reset_index()
email 0
0 [email protected] A
1 [email protected] B
2 [email protected] C
3 [email protected] D
Now all you need to do is rename the value column and name the index:
df = df.rename(columns= {0: 'list'})
df.index.name = 'index'
Your DataFrame is ready for further analysis.
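The same result can probably be reached in one step, since Series.reset_index accepts a name argument for the value column. This is a sketch with hypothetical example.com addresses and values standing in for the elided ones.

```python
import pandas as pd

# Hypothetical addresses and values; the originals are elided in the post.
email_series = pd.Series(
    ["A", "B", "C"],
    index=pd.Index(
        ["u1@example.com", "u2@example.com", "u3@example.com"], name="email"
    ),
)

# Move the index into an "email" column and call the value column "list".
df = email_series.reset_index(name="list")
df.index.name = "index"
print(df)
```

This skips the intermediate column called 0 and the separate rename call.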
Update: I just came across this link where the answers are surprisingly similar to mine here.