This is basic, but it is worth getting the idea straight. The simplest ways to split a string in Python are the built-in split() method on the string itself and the re library.
Python String Split
a = "ini budi, iwan dan ibu budi. Selain itu ini juga ada 'madu' budi."
a.split(' ')
['ini', 'budi,', 'iwan', 'dan', 'ibu', 'budi.', 'Selain', 'itu', 'ini', 'juga', 'ada', "'madu'", 'budi.']
This simply splits on a space, or on whatever other delimiter you pass as the argument to split().
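As a small illustration of how the delimiter argument changes the result, here is a sketch using a made-up string with irregular whitespace (the sample text and variable names are assumptions, not part of the original example):

```python
# Hypothetical sample text with a double space and a tab (not from the post).
text = "ini  budi,\tiwan dan ibu"

# An explicit ' ' delimiter splits on every single space, so the double
# space yields an empty string and the tab stays inside one token.
by_space = text.split(' ')

# With no argument, split() treats any run of whitespace as one separator.
by_whitespace = text.split()

# maxsplit limits how many splits are made, counting from the left.
first_split = text.split(maxsplit=1)

print(by_space)       # ['ini', '', 'budi,\tiwan', 'dan', 'ibu']
print(by_whitespace)  # ['ini', 'budi,', 'iwan', 'dan', 'ibu']
print(first_split)    # ['ini', 'budi,\tiwan dan ibu']
```

If the input may contain tabs or repeated spaces, calling split() without an argument is usually the safer choice.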
Python split and keep the delimiters
Sometimes you need to process the text and then join the pieces back together afterwards. Wrapping the pattern in a capturing group makes re.split() keep the delimiters in the result:
import re
re.split(r'(\W)', a)
['ini', ' ', 'budi', ',', '', ' ', 'iwan', ' ', 'dan', ' ', 'ibu', ' ', 'budi', '.', '', ' ', 'Selain', ' ', 'itu', ' ', 'ini', ' ', 'juga', ' ', 'ada', ' ', '', "'", 'madu', "'", '', ' ', 'budi', '.', '']
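Because the delimiters are kept, the pieces can be joined back into the original string after processing. A minimal sketch, assuming the processing step is simply upper-casing each word (that step is only an illustration):

```python
import re

a = "ini budi, iwan dan ibu budi. Selain itu ini juga ada 'madu' budi."

# The capturing group keeps every delimiter as its own list item.
parts = re.split(r'(\W)', a)

# Joining the untouched pieces reproduces the original string exactly.
roundtrip = ''.join(parts)

# Illustrative processing step (an assumption): upper-case only the word
# tokens and leave delimiters and empty strings alone.
processed = [p.upper() if p.isalnum() else p for p in parts]
result = ''.join(processed)

print(roundtrip == a)  # True
print(result)
```

The empty strings in the list are harmless here, since joining them adds nothing.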
Python split on two or more delimiters
This is a different case: the string has to be split on more than one delimiter. You can use Kenneth's function:
def tsplit(string, delimiters):
    """Behaves str.split but supports multiple delimiters."""
    delimiters = tuple(delimiters)
    stack = [string,]
    for delimiter in delimiters:
        for i, substring in enumerate(stack):
            substack = substring.split(delimiter)
            stack.pop(i)
            for j, _substring in enumerate(substack):
                stack.insert(i+j, _substring)
    return stack
tsplit(a, (',', '/', '-',' ','.','\''))
['ini', 'budi', '', 'iwan', 'dan', 'ibu', 'budi', '', 'Selain', 'itu', 'ini', 'juga', 'ada', '', 'madu', '', 'budi', '']
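The same result can probably be had with a single regular expression built from the delimiter list. The sketch below is one possible alternative, not the function quoted above; split_on_any is a hypothetical helper name:

```python
import re

def split_on_any(text, delimiters):
    """Split text on any of the given literal delimiters (hypothetical helper)."""
    # Try longer delimiters first so '--' is not consumed as two '-'.
    ordered = sorted(delimiters, key=len, reverse=True)
    # re.escape makes characters such as '.' or '-' match literally.
    pattern = '|'.join(re.escape(d) for d in ordered)
    return re.split(pattern, text)

a = "ini budi, iwan dan ibu budi. Selain itu ini juga ada 'madu' budi."
pieces = split_on_any(a, (',', '/', '-', ' ', '.', "'"))
print(pieces)
```

For this input the list matches the tsplit output, including the empty strings that appear wherever two delimiters sit next to each other.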
Other related search terms, such as:
python split multiple delimiters
python split string into list delimiter
python split on multiple characters
python split on different characters
python substring delimiter
python split separator
python split include delimiter
python split two characters
python convert string to raw
regex find text between two strings
are solved with re, as simply as this:
re.split(r'\W+', a)
['ini', 'budi', 'iwan', 'dan', 'ibu', 'budi', 'Selain', 'itu', 'ini', 'juga', 'ada', 'madu', 'budi', '']
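Note the empty string at the end of that list: it appears because the text ends with a delimiter. If that is unwanted, two common ways around it are sketched here (the variable names are only illustrative):

```python
import re

a = "ini budi, iwan dan ibu budi. Selain itu ini juga ada 'madu' budi."

# Option 1: split as before, then drop the empty strings.
words_split = [w for w in re.split(r'\W+', a) if w]

# Option 2: collect the runs of word characters directly with findall,
# which never produces empty strings.
words_found = re.findall(r'\w+', a)

print(words_split)
print(words_found == words_split)  # True
```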