You have an array of bytestrings; dtype is S:
In [338]: arr=np.array((b'first_element', b'element'))
In [339]: arr
Out[339]:
array([b'first_element', b'element'],
dtype='|S13')
astype easily converts them to unicode, the default string type for Py3.
In [340]: arr.astype('U13')
Out[340]:
array(['first_element', 'element'],
dtype='<U13')
There is also a library of string functions, np.char, which applies the corresponding string method to each element of a string array:
In [341]: np.char.decode(arr)
Out[341]:
array(['first_element', 'element'],
dtype='<U13')
astype is faster, but np.char.decode lets you specify an encoding.
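As a minimal sketch of that difference, the snippet below decodes byte strings that contain non-ASCII UTF-8 text by passing an explicit encoding to np.char.decode. The sample values (including 'café') are made up for illustration and are not from the original question.

```python
import numpy as np

# Byte strings holding UTF-8 encoded text; the second value is not pure ASCII.
arr = np.array([b'first_element', 'café'.encode('utf-8')])

# np.char.decode calls bytes.decode on every element with the given encoding.
decoded = np.char.decode(arr, encoding='utf-8')
print(decoded)

# astype takes no encoding argument; for ASCII-only data both routes agree.
ascii_only = np.array([b'first_element', b'element'])
as_unicode = ascii_only.astype('U13')
print(as_unicode)
```

For data known to be plain ASCII, astype is the shorter route; the explicit encoding matters once the bytes hold anything else.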
See also How to decode a numpy array of dtype=numpy.string_?
Answer from hpaulj on Stack Overflow
If you want the result to be a (Python) list of strings, you can use a list comprehension:
>>> l = [el.decode('UTF-8') for el in array1]
>>> print(l)
['element', 'element 2']
>>> print(type(l))
<class 'list'>
Alternatively, if you want to keep it as a Numpy array, you can use np.vectorize to make a vectorized decoder function:
>>> decoder = np.vectorize(lambda x: x.decode('UTF-8'))
>>> array2 = decoder(array1)
>>> print(array2)
['element' 'element 2']
>>> print(type(array2))
<class 'numpy.ndarray'>
https://imgur.com/gallery/yAdAjdx
Hello,
Linked is a screenshot of the two error messages I keep receiving, as well as the code leading up to them. I'm trying to run an uplift on a classification tree. Any help is appreciated; I am still fairly new to Python. Thank you!!
In Python3, I have the following numpy array of strings.
Each string in the numpy array is in the form b'MD18EE' instead of MD18EE.
e.g.
import numpy as np
print(array1)
(b'first_element', b'element',...)
Normally, one would use .decode('UTF-8') to decode these elements. However, if I try
array1 = array1.decode('UTF-8')
I get the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'decode'
How do I decode these elements from a numpy array? (That is, I don't want b'')
The problem is that train_test_split(X, y, ...) is returning numpy arrays here, not pandas DataFrames, and numpy arrays have no attribute named columns.
If you want to see what features SelectFromModel kept, you need to substitute X_train (which is a numpy.array) with X which is a pandas.DataFrame.
selected_feat= X.columns[(sel.get_support())]
This will return the columns (as a pandas Index) kept by the feature selector.
If you wanted to see how many features were kept you can just run this:
sel.get_support().sum() # by default this will count 'True' as 1 and 'False' as 0
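The idea can be sketched without fitting a model. In the snippet below, the DataFrame, its column names, and the boolean mask are hypothetical stand-ins; in the real code the mask would come from sel.get_support().

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names are made up for illustration.
df = pd.DataFrame({
    'f0': [1, 2, 3],
    'f1': [4, 5, 6],
    'f2': [7, 8, 9],
    'label': [0, 1, 0],
})

X = df.iloc[:, :3]    # still a DataFrame, so X.columns exists
X_values = X.values   # plain ndarray, the column names are gone

print(hasattr(X, 'columns'))         # True
print(hasattr(X_values, 'columns'))  # False

# Hypothetical boolean mask standing in for sel.get_support().
mask = np.array([True, False, True])

selected_feat = X.columns[mask]  # index the column labels with the mask
print(list(selected_feat))       # ['f0', 'f2']
print(mask.sum())                # 2, since True counts as 1
```

Indexing X.columns only works while X is still a DataFrame, which is why the lookup has to use the original frame rather than the array.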
The error occurs because of these lines:
X = df.iloc[:,:24481].values
y = df.iloc[:, -1].values
You should remove .values, or keep extra DataFrame versions X_col and y_col, like this:
X_col = df.iloc[:,:24481]
y_col = df.iloc[:, -1]