Rather than create 2 temporary dfs you can just pass these as params within a dict using the DataFrame constructor:
pd.DataFrame({'email':sf.index, 'list':sf.values})
There are lots of ways to construct a df, see the docs
Answer from EdChum on Stack Overflow
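As a minimal sketch of the constructor approach above: the Series `sf` below is a hypothetical stand-in for the one in the question (the example.com addresses and the values are made up for illustration). The `rename_axis` / `reset_index(name=...)` variant is an alternative that gives the same two columns, not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the question's `sf`: values indexed by email address.
sf = pd.Series(
    ["A", "B", "C"],
    index=["a@example.com", "b@example.com", "c@example.com"],
)

# Pass the index and the values as two columns of a dict.
df = pd.DataFrame({"email": sf.index, "list": sf.values})
print(df)

# Alternative: name the index, then move it into a column.
alt = sf.rename_axis("email").reset_index(name="list")
print(alt)
```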
to_frame():
Starting with the following Series, email_series:
email
[email protected] A
[email protected] B
[email protected] C
dtype: object
email_series = pd.Series(["[email protected]", ...])
I use to_frame to convert the Series to a DataFrame:
df = email_series.to_frame().reset_index()
email 0
0 [email protected] A
1 [email protected] B
2 [email protected] C
3 [email protected] D
Now all you need to do is rename the value column and give the index a name:
df = df.rename(columns= {0: 'list'})
df.index.name = 'index'
Your DataFrame is ready for further analysis.
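The same steps can be sketched end to end as below. The Series contents are hypothetical (example.com addresses and letter values chosen for illustration). Passing `name=` to `to_frame` is one way to set the column name up front, so the separate `rename` step is not needed.

```python
import pandas as pd

# Hypothetical Series: values indexed by an index named "email".
email_series = pd.Series(
    ["A", "B", "C"],
    index=pd.Index(
        ["a@example.com", "b@example.com", "c@example.com"], name="email"
    ),
)

# to_frame(name=...) names the value column; reset_index() moves
# the "email" index into an ordinary column.
df = email_series.to_frame(name="list").reset_index()

# Name the new default integer index.
df.index.name = "index"
print(df)
```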
Update: I just came across this link where the answers are surprisingly similar to mine here.
python - How to convert single-row pandas data frame to series? - Stack Overflow
Why does Pandas convert my DataFrame into a DataFrame and a Series inside a for loop?
[Pandas Series] Using series as a column; how is this logical?
Start with an empty pandas dataframe/series, and append using a loop
Try not to do this. The way that pandas stores its data makes appending row-by-row like this very inefficient. Instead, consider making a separate dataframe for each tab, and then concatenating them together in one go. If you have to build up row-by-row, consider doing so in a native data structure like a list or a dict, and then converting the whole thing to a dataframe at the end.
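A rough sketch of both suggestions, using made-up data (the column names `id`, `square`, `tab`, and `n` are assumptions for illustration only): collect rows in a plain list and convert once, or build one small frame per tab and concatenate them in a single call.

```python
import pandas as pd

# Option 1: accumulate rows in a native list of dicts, convert once at the end.
rows = []
for i in range(3):
    rows.append({"id": i, "square": i * i})
built = pd.DataFrame(rows)

# Option 2: one small DataFrame per "tab", concatenated in one go.
frames = [
    pd.DataFrame({"tab": [name], "n": [n]})
    for name, n in [("first", 1), ("second", 2)]
]
combined = pd.concat(frames, ignore_index=True)

print(built)
print(combined)
```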
More on reddit.com
You can transpose the single-row dataframe (which still results in a dataframe) and then squeeze the results into a series (the inverse of to_frame).
>>> import pandas as pd
>>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
>>> df.squeeze(axis=0)
a0 0
a1 1
a2 2
a3 3
a4 4
Name: 0, dtype: int64
Note: To accommodate the point raised by @IanS (even though it is not in the OP's question), test for the dataframe's size. I am assuming that df is a dataframe, but the edge cases are an empty dataframe, a dataframe of shape (1, 1), and a dataframe with more than one row, in which case the user should implement their desired functionality.
if df.empty:
    # Empty dataframe, so convert to empty Series.
    result = pd.Series()
elif df.shape == (1, 1):
    # DataFrame with one value, so convert to series with appropriate index.
    result = pd.Series(df.iat[0, 0], index=df.columns)
elif len(df) == 1:
    # Convert to series per OP's question.
    result = df.T.squeeze()
else:
    # Dataframe with multiple rows. Implement desired behavior.
    pass
This can also be simplified along the lines of the answer provided by @themachinist.
if len(df) > 1:
    # Dataframe with multiple rows. Implement desired behavior.
    pass
else:
    result = pd.Series() if df.empty else df.iloc[0, :]
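The edge cases above could be wrapped in a small helper, sketched below. The function name `single_row_to_series` and the choice to raise `ValueError` for multi-row input are assumptions for illustration, not part of the original answers. The sketch also shows why the (1, 1) case matters: a bare `squeeze()` collapses a one-cell frame to a scalar, while `squeeze(axis=0)` and `iloc[0]` keep a Series.

```python
import pandas as pd


def single_row_to_series(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: convert a frame with at most one row to a Series."""
    if len(df) > 1:
        # Assumption: treat multi-row input as an error; adapt as needed.
        raise ValueError("expected at most one row, got {}".format(len(df)))
    if df.empty:
        # Explicit dtype avoids relying on the version-dependent default.
        return pd.Series(dtype=object)
    return df.iloc[0]


wide = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
one_cell = pd.DataFrame([[7]], columns=["a0"])

print(single_row_to_series(wide))
print(single_row_to_series(one_cell))       # still a Series of length 1
print(type(one_cell.squeeze()))             # a scalar, not a Series
print(type(one_cell.squeeze(axis=0)))       # a Series
```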
It's not smart enough to realize it's still a "vector" in math terms.
Say rather that it's smart enough to recognize a difference in dimensionality. :-)
I think the simplest thing you can do is select that row positionally using iloc, which gives you a Series with the columns as the new index and the values as the values:
>>> df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
>>> df
a0 a1 a2 a3 a4
0 0 1 2 3 4
>>> df.iloc[0]
a0 0
a1 1
a2 2
a3 3
a4 4
Name: 0, dtype: int64
>>> type(_)
<class 'pandas.core.series.Series'>
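To make the dimensionality point concrete, here is a short sketch using the same example frame: a scalar position in `iloc` drops the row axis and returns a Series, while a one-element list keeps it and returns a one-row DataFrame. Going back with `to_frame().T` is shown as one possible round trip.

```python
import pandas as pd

df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])

row_as_series = df.iloc[0]     # scalar position: 1-D result
row_as_frame = df.iloc[[0]]    # list of positions: stays 2-D

print(type(row_as_series))
print(type(row_as_frame))

# Round trip: Series -> single-column frame -> transposed to a single row.
back = row_as_series.to_frame().T
print(back.shape)
```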
arms_bayes_list = pandas.DataFrame(
{'arms': pandas.Series(arms_bayes), 'priori_mean': pandas.Series(arm_priori_mean_vectorized(arms_bayes)),
'variance_square': pandas.Series(arm_variance_square_vectorized(arms_bayes)),
'posterior_mean': pandas.Series(arm_posterior_mean_vectorized(arms_bayes)),
'posterior_variance_square': pandas.Series(arm_posterior_variance_square_vectorized(arms_bayes)),
'empirical_mean': pandas.Series(arm_empirical_mean_vectorized(arms_bayes)),
'mean': pandas.Series(arm_mean_vectorized(arms_bayes)),
'priori_variance_square': pandas.Series(arm_variance_square_vectorized(arms_bayes))})
optimal_mean_bayes = numpy.amax(arms_bayes_list[["mean"]])
print(type(arms_bayes_list))
# This gives a datatype <class'pandas.core.frame.DataFrame'>
for round_no in range(int(no_of_rounds)):
    print(type(arms_bayes_list))
    # This gives a datatype <class'pandas.core.frame.DataFrame'>
    # <class'pandas.core.series.Series'>
Iterating over the data frame I get the error that Series has no attribute called 'itertuples'/ 'iterrow' so I really can't iterate over the DataFrame.
Well, this is rather mysterious to me. For some reason the type of the DataFrame is different inside the for loop than it is outside the for loop. I don't think I've made any mistakes, but another set of eyes would definitely help.