Python doesn't have arrays like in C++ or Java. Instead you can use a list.
my_list = ['hello', 'hi', 'hurra']
for text in my_list:
    print(text)
There are several ways to iterate over a list, for example:
for i, text in enumerate(my_list):
    print(i, text)
for i in range(len(my_list)):
    print(my_list[i])
You could use it like an "array", but there are many built-in functions that you would find very useful. The list is a basic and essential data structure in Python. Most books and tutorials cover this topic in the first few chapters.
P.S.: The code is for Python 3.
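A few of the built-in list operations referred to here, as a minimal sketch. The extra value 'hey' and the variable names are illustrative assumptions, not part of the original answer.

```python
# my_list matches the example above; 'hey' is an illustrative extra value.
my_list = ['hello', 'hi', 'hurra']

my_list.append('hey')                       # add one item to the end
first = my_list[0]                          # index access, as with an array
last_two = my_list[-2:]                     # slicing returns a new list
position = my_list.index('hi')              # position of the first match
ordered = sorted(my_list)                   # sorted copy; my_list is unchanged
lengths = [len(text) for text in my_list]   # list comprehension

print(first, last_two, position, ordered, lengths)
```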
Answer from ioannu on Stack Overflow
You can use dictionaries for this purpose.
mydict = {'i1': 'hello', 'i2': 'hi', 'i3': 'hurra'}
for i, (key, value) in enumerate(mydict.items()):
    print("index: {}, key: {}, value: {}".format(i, key, value))
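If the goal is to look items up by name rather than by position, a dictionary also gives direct access by key. A minimal sketch reusing the same mydict values; the key 'i4' and the fallback 'n/a' are made up for illustration.

```python
mydict = {'i1': 'hello', 'i2': 'hi', 'i3': 'hurra'}

second = mydict['i2']                # direct lookup by key
missing = mydict.get('i4', 'n/a')    # .get() returns a fallback instead of raising KeyError
keys = list(mydict)                  # keys in insertion order (Python 3.7+)
values = list(mydict.values())       # values in the same order

print(second, missing, keys, values)
```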
You may achieve what you need by first removing everything except uppercase letters, then extracting pairs of letters, and then applying some custom logic to pick the required value from the resulting list. Note that recent pandas versions treat the pattern in str.replace as a literal string unless regex=True is passed:
>>> df['array_column'].str.replace('[^A-Z]+', '', regex=True).str.findall('([A-Z]{2})').apply(lambda d: [''] if len(d) == 0 else d).apply(lambda x: 'HL' if len(x) == 1 and x[0] == 'HL' else [m for m in x if m != 'HL'][0])
0    HL
1    PG
2    PG
3    RC
Name: array_column, dtype: object
Details
.str.replace('[^A-Z]+', '') - remove all characters other than uppercase letters (the pattern is meant as a regex)
.str.findall('([A-Z]{2})') - extract pairs of letters
.apply(lambda d: [''] if len(d) == 0 else d) - add an empty item if there was no regex match in the previous step
.apply(lambda x: 'HL' if len(x) == 1 and x[0] == 'HL' else [m for m in x if m != 'HL'][0]) - custom logic: if the list has length 1 and its only item is HL, keep it; otherwise remove all HL items and take the first remaining element
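The same steps can be followed on a single string without pandas. This is only a sketch for tracing the logic: the function name extract_code and the sample strings are assumptions, chosen so that they produce the output shown above.

```python
import re

def extract_code(text):
    """Plain-Python sketch of the chained pandas steps, for one string."""
    letters = re.sub(r'[^A-Z]+', '', text)       # drop everything but A-Z
    pairs = re.findall(r'[A-Z]{2}', letters)     # consecutive letter pairs
    if not pairs:                                # no match: keep an empty item
        pairs = ['']
    if len(pairs) == 1 and pairs[0] == 'HL':     # a lone HL is kept
        return 'HL'
    return [p for p in pairs if p != 'HL'][0]    # otherwise first non-HL pair

# Assumed sample inputs, not taken from the original question.
samples = ['HL111', 'PG3939HL11', 'HL339PG', 'RC--HL--PG']
results = [extract_code(s) for s in samples]
print(results)
```

As with the chained expression, an input that contains several pairs that are all HL (for example 'HLHL') leaves nothing after filtering and raises IndexError.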
This is one approach using apply.
Demo:
import re
import pandas as pd

def checkValue(value):
    # Collect every pair of consecutive uppercase letters.
    value = re.findall(r"[A-Z]{2}", value)
    if len(value) > 1 and "HL" in value:
        # Several codes including HL: return the first one that is not HL.
        return [i for i in value if i != "HL"][0]
    else:
        return value[0]

df = pd.DataFrame({"column1": ["HL111", "PG3939HL11", "HL339PG", "RC--HL--PG"]})
print(df.column1.apply(checkValue))
Output:
0    HL
1    PG
2    PG
3    RC
Name: column1, dtype: object
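checkValue raises IndexError when a string contains no pair of uppercase letters, or when every pair is HL. If such rows can occur, a more defensive variant might look like the sketch below; the name check_value_safe and the default parameter are hypothetical additions, not part of the original answer.

```python
import re

def check_value_safe(value, default=None):
    """Hypothetical variant of checkValue that does not raise on odd input."""
    pairs = re.findall(r"[A-Z]{2}", value)
    if not pairs:
        return default                 # no two-letter code at all
    non_hl = [p for p in pairs if p != "HL"]
    # Prefer the first non-HL code; fall back to HL when it is the only code.
    return non_hl[0] if non_hl else "HL"

samples = ["HL111", "PG3939HL11", "HL339PG", "RC--HL--PG", "12345", "HLHL"]
safe_results = [check_value_safe(s) for s in samples]
print(safe_results)
```

It can be passed to apply in the same way, for example df.column1.apply(check_value_safe).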