String

String handling

Code

import string

Constants

Action Code Details
Lowercase and uppercase letters
string.ascii_letters
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
Lowercase letters
string.ascii_lowercase
abcdefghijklmnopqrstuvwxyz
Uppercase letters
string.ascii_uppercase
ABCDEFGHIJKLMNOPQRSTUVWXYZ
Digits
string.digits
0123456789
Hexadecimal digits
string.hexdigits
0123456789abcdefABCDEF
Whitespace characters
string.whitespace
Includes space, tab, linefeed, return, formfeed, and vertical tab.
Punctuation characters
string.punctuation
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
Printable characters
string.printable
Combination of digits, letters, punctuation, and whitespace.
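The constants above are ordinary strings, so they work with membership tests and with translation tables. A minimal sketch, assuming two hypothetical helpers (strip_punctuation and is_hex are illustrative names, not part of the string module):

```python
import string

def strip_punctuation(text: str) -> str:
    """Remove every ASCII punctuation character from text (hypothetical helper)."""
    # The third argument of str.maketrans lists characters to delete
    table = str.maketrans('', '', string.punctuation)
    return text.translate(table)

def is_hex(text: str) -> bool:
    """Return True if text is non-empty and contains only hexadecimal digits."""
    return bool(text) and all(ch in string.hexdigits for ch in text)

print(strip_punctuation('Hello, world!'))  # Hello world
print(is_hex('c0ffee'), is_hex('xyz'))     # True False
```

Note that string.punctuation covers ASCII punctuation only, so characters such as typographic quotes are left in place.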

Create

Action Code Details
Empty
''
Literal
'hello world'
Literals (concatenate)
'a' 'b' 'c'
Random lowercase string of length n
''.join(random.choices(string.ascii_lowercase, k=n))
Random alphanumeric string of length n
''.join(random.choices(string.ascii_lowercase + string.digits, k=n))
From list, separated by comma
','.join(['a', 'b'])
Object to string
str(x)
Positional formatting
'First {0} then {1}'.format(1 + 1, 2 * 2)
Named formatting
'First {sum} then {mult}'.format(sum = 1 + 1, mult = 2 * 2)
Named element formatting
'a0 = {a[0]}'.format(a=[1,2])
Named attribute formatting
'Instance is of type: {p.type}'.format(p=Prop)
Named formatting of whole number
'a = {num:,}'.format(num=int_var)
Dynamic formatting based on dict
'Value of a and b is {a} and {b}'.format_map(dict(a=1, b=2))
Whole number
'a = {:d}'.format(3)
Whole number with thousands separator
'a = {:,d}'.format(1000)
Outputs 'a = 1,000'
Padded whole number
'a = {:3d}'.format(3)
Outputs 'a =   3' (padded with spaces to width 3)
Zero-padded whole number
'a = {:03d}'.format(3)
Outputs 'a = 003'
Float
'a = {:f}'.format(3.14)
Float as whole number
'a = {:.0f}'.format(3.14)
Outputs 'a = 3'
Float with decimal-point padding
'a = {:06.2f}'.format(3.1234)
Datetime with format
'{:%Y-%m-%d %H:%M}'.format(datetime(2001, 2, 3, 4, 5))
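The same format specifications can also be used in f-strings. The sketch below is illustrative only (the variable names n and pi and the examples dict are assumptions) and shows what several of the specifications produce:

```python
from datetime import datetime

n = 1234567
pi = 3.14159

examples = {
    'thousands': f'{n:,d}',         # '1,234,567'
    'padded': f'{7:3d}',            # '  7'
    'zero_padded': f'{7:03d}',      # '007'
    'two_decimals': f'{pi:.2f}',    # '3.14'
    'padded_float': f'{pi:06.2f}',  # '003.14'
    'timestamp': f'{datetime(2001, 2, 3, 4, 5):%Y-%m-%d %H:%M}',  # '2001-02-03 04:05'
}

# Print each result with repr() so that padding spaces stay visible
for name, text in examples.items():
    print(f'{name:>12}: {text!r}')
```

In a specification such as 06.2f, the width 6 counts the whole field, including the decimal point and the digits after it.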

Test

Action Code Details
Is str
type(x) is str
Is string (str or subclass)
isinstance(x, str)
Empty
not x
Not empty
x
Equal
x == y
Contains substring
substr in x
Letters only
x.isalpha()
Digits only
x.isdigit()
Alphanumeric characters only
x.isalnum()
Does not contain substring
substr not in x
Starts with
x.startswith(prefix)
Ends with
x.endswith(suffix)
Matches regex pattern
bool(re.search(r'\w', 'abc'))
Contains n regex substrings
len(re.findall(r'\w', 'a. a')) == n
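The regex tests above can be wrapped in small helpers. This is a sketch under assumptions: the function names are hypothetical, and re.fullmatch is added here to contrast "matches somewhere" with "matches the whole string":

```python
import re

def matches_anywhere(pattern: str, text: str) -> bool:
    """True if the pattern matches at any position in text."""
    return re.search(pattern, text) is not None

def matches_whole(pattern: str, text: str) -> bool:
    """True only if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

def count_matches(pattern: str, text: str) -> int:
    """Number of non-overlapping matches of pattern in text."""
    return len(re.findall(pattern, text))

print(matches_anywhere(r'\d+', 'abc123'))  # True
print(matches_whole(r'\d+', 'abc123'))     # False
print(count_matches(r'\w', 'a. a'))        # 2
```

Raw strings (the r prefix) keep backslashes such as \w and \d from being read as string escape sequences.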

Extract

Action Code Details
Number of characters (length)
len(x)
Counts Unicode code points, not bytes or grapheme clusters
Session hash
hash(x)
Find first index of substring
x.index(substr)
Raises error if not found
Try find index of substring
x.find(substr)
Try find last index of substring
x.rfind(substr)
Count number of non-overlapping substring occurrences
x.count(substr)
Count number of non-overlapping substring occurrences in range [n, m)
x.count(substr, n, m)
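The difference between index and find matters when the substring may be absent: index raises ValueError, while find returns -1. A minimal sketch, where find_or_none is a hypothetical helper that turns the -1 sentinel into None:

```python
def find_or_none(text: str, substr: str):
    """Return the first index of substr in text, or None if it is absent."""
    position = text.find(substr)  # find returns -1 when nothing matches
    return None if position == -1 else position

text = 'banana'
print(find_or_none(text, 'na'))  # 2
print(find_or_none(text, 'x'))   # None
print(text.rfind('na'))          # 4
print(text.count('ana'))         # 1, because matches may not overlap
print(text.count('a', 2, 5))     # 1, counted within text[2:5] == 'nan'
```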

Derive

Action Code Details
Remove substring
x.replace(substr, '')
Remove regex group pattern
re.sub(pattern, '', x)

Transform the string whilst preserving length

Action Code Details
Lower case
x.lower()
Upper case
x.upper()
Capitalize
x.capitalize()
Map from dict
{'yes': 'ja', 'no': 'nee'}[x]
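Indexing the dict directly raises KeyError for a string that has no entry. One possible variant, sketched here with a hypothetical translate_word helper, falls back to the input when no mapping exists:

```python
translations = {'yes': 'ja', 'no': 'nee'}

def translate_word(word: str) -> str:
    """Map word through the dict, returning it unchanged if it has no entry."""
    return translations.get(word, word)

print(translate_word('yes'))    # ja
print(translate_word('maybe'))  # maybe
```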

Order

Action Code Details
Reverse characters
x[::-1]

Expand string

Action Code Details
Left-pad to length n
x.rjust(n)
Right-pad to length n
x.ljust(n)
Left-right padding to length n
x.center(n)
Replicate n times
x * n
Concatenate
x + y
Join items of an iterable, using x as separator
x.join(iterable)
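The padding methods are easy to mix up: rjust right-justifies the text, so it adds padding on the left, and ljust does the opposite. A short illustrative sketch (the sample values are assumptions), printed with repr so the spaces are visible:

```python
x = 'abc'

print(repr(x.rjust(6)))       # '   abc'  (padding on the left)
print(repr(x.ljust(6)))       # 'abc   '  (padding on the right)
print(repr(x.center(7)))      # '  abc  '
print(repr(x.rjust(6, '.')))  # '...abc'  (custom fill character)
print(repr(x * 2))            # 'abcabc'
print(', '.join(['a', 'b', 'c']))  # a, b, c
```

If the string is already at least n characters long, all three padding methods return it unchanged.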

Substring

Action Code Details
First character
x[0]
i-th character
x[i]
Index beyond length will raise error
Last character
x[-1]
Substring (slice)
x[2:3]
First n characters
x[:n]
Last n characters
x[-n:]
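Strip leading and trailing whitespace
x.strip()
Use x.lstrip() or x.rstrip() to strip one side only
Strip leading and trailing characters
x.strip('abc')
Removes any of the characters a, b, c from both ends, not the literal string 'abc'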
Remove prefix
x.removeprefix(y)
Remove suffix
x.removesuffix(y)
Substring up to first occurrence of y
x.split(y)[0]
y is excluded
Substring up to first line break
x.split('\n')[0]
Line break is excluded
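A common pitfall is that strip treats its argument as a set of characters, whereas removeprefix and removesuffix (Python 3.9+) remove one exact substring. A minimal sketch with made-up sample values:

```python
host = 'www.example.com'

print(host.strip('cmowz.'))       # example      (any of these characters, both ends)
print(host.removeprefix('www.'))  # example.com  (exact prefix only)
print(host.removesuffix('.com'))  # www.example  (exact suffix only)
print(repr('  padded  '.lstrip()))  # 'padded  '  (leading whitespace only)

text = 'first line\nsecond line'
# maxsplit=1 stops after the first separator, so the rest is not split needlessly
print(text.split('\n', 1)[0])     # first line
```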

Combine

Action Code Details
Concatenate strings
x + y

Convert

Action Code Details
To bytes
x.encode()
Parse as integer
int(x)
Parse as float
float(x)
Parse date (unknown format)
pandas.to_datetime('2023 Jan 5')
To datetime from YYYY-MM-DD format
datetime.strptime('2023-12-31', '%Y-%m-%d')
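int and float raise ValueError when the string is not a valid number, so parsing untrusted input usually needs a guard. The sketch below assumes a hypothetical parse_int helper with a default fallback:

```python
from datetime import datetime

def parse_int(text: str, default=None):
    """Parse text as an integer, returning default if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default

print(parse_int('42'))    # 42
print(parse_int(' 42 '))  # 42, int() ignores surrounding whitespace
print(parse_int('4.2'))   # None

# strptime returns a datetime; call .date() to keep only the calendar date
print(datetime.strptime('2023-12-31', '%Y-%m-%d').date())  # 2023-12-31

# encode() uses UTF-8 by default, so one character may become several bytes
print('\u00e9'.encode())  # b'\xc3\xa9'
```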

Convert to list of substrings

Action Code Details
Split string into two parts by separator (as triplet)
x.partition(sep)
Split string into multiple parts by separator sep (as list)
x.split(sep)
Split string into lines (as list)
x.splitlines()
Split string into cumulative parts by separator sep (as list)
list(itertools.accumulate(x.split(sep), lambda left, right: sep.join([left, right])))
For sep='.', 'a.b.c' becomes ['a', 'a.b', 'a.b.c']
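The splitting methods return different shapes: partition always gives a 3-tuple, while split and splitlines give lists. A small sketch, with cumulative_parts as a hypothetical wrapper around the accumulate expression:

```python
from itertools import accumulate

def cumulative_parts(text: str, sep: str) -> list:
    """Return growing prefixes of text, cut at each occurrence of sep."""
    return list(accumulate(text.split(sep),
                           lambda left, right: sep.join([left, right])))

print(cumulative_parts('a.b.c', '.'))     # ['a', 'a.b', 'a.b.c']
print('a.b.c'.partition('.'))             # ('a', '.', 'b.c')
print('a.b.c'.split('.'))                 # ['a', 'b', 'c']
print('one\ntwo\r\nthree'.splitlines())   # ['one', 'two', 'three']
```

partition returns (text, '', '') when the separator is not found, which makes unpacking into three names safe.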