String Manipulation in Python
Trimming
As data scientists, we often encounter numerical data stored as strings, for example a birth year written as 1991 yr. or a weight written as 142 lbs. The attached units prevent us from applying math functions (such as the mean) directly.
Fortunately, Python can solve this problem. The built-in .strip method returns a copy of the string with the leading and trailing characters removed. To remove characters only on the left side, use .lstrip, and use .rstrip for the right side. By default, all three methods remove whitespace characters (spaces, tabs, newlines).
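A minimal sketch of the default (no-argument) behavior described above. The sample string and the variable names are made up for illustration.

```python
# Hypothetical raw value with whitespace on both sides.
raw = "  \t142 lbs \n"

both = raw.strip()    # whitespace removed from both ends
left = raw.lstrip()   # whitespace removed from the left end only
right = raw.rstrip()  # whitespace removed from the right end only

# repr() makes the remaining whitespace visible.
print(repr(both))
print(repr(left))
print(repr(right))

# The original string is unchanged, because each method returns a copy.
print(repr(raw))
```

Note that the inner space between 142 and lbs survives in every case: only the ends of the string are affected.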
To remove specific characters instead, pass them as an argument. The characters are written together in a single string; their order does not matter, because the argument is treated as a set of characters rather than as an exact prefix or suffix. For example,
print("148 lbs".strip(' lbs'))  # 148
print("AB string AB".lstrip('AB '))  # string AB
These methods work as follows: starting from the respective side (left for .lstrip, right for .rstrip, and both sides for .strip), they remove characters one at a time as long as each character appears in the argument, and stop at the first character that does not.
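The removal rule above can be sketched as follows. The sample strings are hypothetical, and str.removesuffix is not mentioned in the lesson: it is shown only as a possible alternative for removing an exact ending, and it assumes Python 3.9 or later.

```python
# The argument is a set of characters, so its order is irrelevant.
same_a = "148 lbs".strip(" lbs")
same_b = "148 lbs".strip("sbl ")

# Removal stops at the first character that is not in the set.
# Here 's', 'l', 'l' are removed, then 'a' stops the process.
surprise = "12 balls".rstrip(" lbs")

# Assumption: Python 3.9+. removesuffix removes the exact ending only.
exact = "148 lbs".removesuffix(" lbs")
untouched = "12 balls".removesuffix(" lbs")

print(same_a, same_b)
print(repr(surprise))
print(repr(exact), repr(untouched))
```

For values like 148 lbs the two approaches agree, since digits are never in the character set; the difference only matters when the text next to the unit happens to contain the same characters.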
Given a list of strings ages containing strings in the format ___ y/o, iterate over the list, remove the ' y/o' ending from each element, convert each element to an integer, and calculate the mean.
Do not worry if you are not familiar with some pieces of code.
Note
It's impossible to calculate the mean for the original list since its elements can't be recognized as numbers.
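One possible way to approach the task, as a sketch only. The contents of ages below are invented for illustration (the actual list is not shown here), and using statistics.mean is an assumption rather than the course's prescribed solution.

```python
from statistics import mean

# Hypothetical data in the ___ y/o format.
ages = ["25 y/o", "31 y/o", "48 y/o", "19 y/o"]

cleaned = []
for age in ages:
    # Remove the characters ' ', 'y', '/', 'o' from the right side,
    # then convert the remaining digits to an integer.
    cleaned.append(int(age.rstrip(" y/o")))

average = mean(cleaned)

print(cleaned)
print(average)
```

Because digits are not in the character set passed to .rstrip, the numeric part of each element is left intact.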