Python Substring



Python is an interpreted, high-level, object-oriented programming language with high-level built-in data structures, combined with dynamic typing and dynamic binding. This makes it very attractive for Rapid Application Development. Python encourages program modularity by supporting modules and packages, which also promotes code reuse. In this article, we will look specifically at substrings, work through several Python substring examples, and pay particular attention to the concept of slicing in Python.


 

Strings in Python

In Python, strings are sequences of Unicode characters. Since Python does not have a separate character data type, a single character is simply a string with a length of 1. The elements of a string can be accessed with square brackets.

Python strings are "immutable", which means that once created, they cannot be changed (Java strings are immutable in the same way). Because strings cannot be changed, we construct *new* strings to represent computed values. For example, the expression ('hello' + 'there') takes the two strings 'hello' and 'there' and constructs a new string 'hellothere'.

Let us briefly take a look at the following topics related to Python strings and then proceed to Python substrings.


Creating a String


Strings in Python can be created with the help of either single quotes or double quotes, or even triple quotes.

A string in single quotes cannot contain an unescaped single quote character, because the interpreter would not be able to recognize where the string starts and ends, and a SyntaxError arises. To avoid this error, double quotes are preferred for strings that contain single quotes. For strings which contain both single quotes and double-quoted words, using triple quotes is suggested. Triple quotes also help in the creation of multiline strings.

Unlike Java, the '+' operator does not automatically convert numbers or other types to a string. The str() function converts values to string form so that they can be combined with other strings.

The print() function prints out one or more Python items followed by a newline. In Python 3, pass end="" to inhibit the newline (in Python 2, where print was a statement, a trailing comma after the items did this).
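The two points above, explicit conversion with str() and the newline added by print(), can be sketched as follows. This is a minimal illustration assuming Python 3; the variable names and values are made up for the example and are not part of the original article.

```python
# Hypothetical sample values, chosen only for illustration.
label = "Number of items: "
count = 3

# '+' does not convert the int for us, so this raises a TypeError.
try:
    message = label + count
except TypeError:
    # Converting explicitly with str() makes the concatenation work.
    message = label + str(count)

print(message)

# print() ends with a newline by default; end="" suppresses it,
# so the next print() continues on the same line.
print("Hello, ", end="")
print("world")
```

The try/except is only there to show that the unconverted version fails; in real code you would simply write label + str(count).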


Let us consider an example of String Creation

# Python program for creation of strings

# Creating a string with single quotes
String1 = 'Welcome to the Geeks World'
print("String with the use of Single Quotes: ")
print(String1)

# Creating a string with double quotes
String1 = "I'm a Geek"
print("\nString with the use of Double Quotes: ")
print(String1)

# Creating a string with triple quotes
String1 = '''I'm a Geek. I live in the world of "Geeks"'''
print("\nString with the use of Triple Quotes: ")
print(String1)

# Triple quotes allow a string to span multiple lines;
# the indentation inside the literal is part of the string
String1 = '''Geeks
            For
            Life'''
print("\nCreating a multiline String: ")
print(String1)


The output of the above program would be

String with the use of Single Quotes:
Welcome to the Geeks World

String with the use of Double Quotes:
I'm a Geek

String with the use of Triple Quotes:
I'm a Geek. I live in the world of "Geeks"

Creating a multiline String:
Geeks
            For
            Life


Accessing characters in Python


In Python, individual characters of a string can be accessed by indexing, and a range of characters can be accessed by slicing. Slicing a string is done with the slicing operator (a colon inside the square brackets). Indexing also accepts negative indices to access characters from the back of the string, e.g. -1 refers to the last character, -2 refers to the penultimate character, and so on.

If you try to access an index that is out of range, Python raises an IndexError. Only integers can be passed as an index; a float or any other type causes a TypeError.
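Here is a small sketch of the indexing rules described above. The sample string is an assumption made for the example. Note one detail the text does not mention: unlike indexing, a slice that runs past the end of the string does not raise an error; it is simply cut off at the end.

```python
word = "Python"  # assumed sample string

first = word[0]         # 'P'
last = word[-1]         # 'n', the last character
penultimate = word[-2]  # 'o', the second-to-last character

# Indexing past the end raises IndexError.
try:
    word[10]
    index_error_raised = False
except IndexError:
    index_error_raised = True

# A float is not a valid index, even if it has an integer value.
try:
    word[1.0]
    type_error_raised = False
except TypeError:
    type_error_raised = True

# Slices are more forgiving: out-of-range bounds are clamped.
clamped = word[2:100]   # 'thon'

print(first, last, penultimate, clamped)
```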


Deleting/Updating from a String


In Python, there is no provision for updating or deleting individual characters of a string. Attempting it results in a TypeError, because strings support neither item assignment nor item deletion. Deleting an entire string is possible with the built-in del keyword. As mentioned before, since Python strings are immutable, the elements of a string cannot be changed once it has been created. Only a new string can be assigned to the same name.
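The behaviour described above can be sketched like this. The string and the variable names are assumptions made for the example; the usual workaround, building a new string from slices of the old one, is shown alongside the errors.

```python
text = "Hello"  # assumed sample string

# Item assignment is not supported on strings.
try:
    text[0] = "J"
    assign_failed = False
except TypeError:
    assign_failed = True

# Item deletion is not supported either.
try:
    del text[0]
    delete_failed = False
except TypeError:
    delete_failed = True

# Workaround: build a *new* string from pieces of the old one.
updated = "J" + text[1:]        # 'Jello'
trimmed = text[:2] + text[3:]   # 'Helo' (character at index 2 left out)

# del removes the whole name; using it afterwards raises NameError.
del text
try:
    text
    name_gone = False
except NameError:
    name_gone = True

print(updated, trimmed)
```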


Escape Sequencing in Python


Writing a string literal that contains both single and double quotes causes a SyntaxError if the string is delimited with either of these quote characters and the inner quotes are left as they are. To write such strings, either triple quotes or escape sequences are used.
Escape sequences begin with a backslash, and the character that follows determines how the sequence is interpreted.

If single quotes are used to delimit a string, then every single quote inside the string must be escaped, and the same applies to double quotes. To have the escape sequences in a string ignored, prefix the literal with r or R; this marks it as a raw string, in which backslashes are kept as literal characters.
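As a quick illustration of the options above, the sketch below writes the same text three ways and then compares a normal string with a raw string. The sample text and the Windows-style path are assumptions chosen for the example.

```python
# The same text written with escapes and with triple quotes.
single = 'I\'m a "Geek"'      # single quote escaped
double = "I'm a \"Geek\""     # double quotes escaped
triple = '''I'm a "Geek"'''   # no escaping needed

same_text = single == double == triple

# In a normal string, \n and \t are escape sequences, so a path
# needs doubled backslashes; a raw string keeps them literally.
escaped_path = "C:\\new\\table"
raw_path = r"C:\new\table"

same_path = escaped_path == raw_path

# "\n" is one newline character; r"\n" is a backslash plus 'n'.
newline_length = len("\n")
raw_length = len(r"\n")

print(single)
print(raw_path)
```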

 

Python String Methods


A method is similar to a function, but it runs "on" an object. If the variable s is a string, then the code s.lower() runs the lower() method on that string object and returns the result (this concept of a method running on an object is one of the basic ideas that make up Object Oriented Programming, OOP).

Python has quite a few methods that string objects can call to perform frequently occurring string tasks. For example, if you need the first letter of a string to be capitalized, you can use the capitalize() method. The table below lists the methods of string objects, followed by the built-in functions that can take a string as a parameter and perform some task.


Method / Function     Description
str.capitalize()      Returns a copy with the first character uppercased and the rest lowercased
str.center()          Centers the string in a given width, padding with a specified character
str.casefold()        Returns a casefolded copy, for caseless comparison
str.count()           Returns the number of non-overlapping occurrences of a substring
str.endswith()        Checks if the string ends with the specified suffix
str.expandtabs()      Replaces tab characters with spaces
str.encode()          Returns the string encoded as bytes
str.find()            Returns the index of the first occurrence of a substring, or -1 if not found
str.format()          Formats values into the string
str.index()           Returns the index of the first occurrence of a substring; raises ValueError if not found
str.isalnum()         Checks if all characters are alphanumeric
str.isalpha()         Checks if all characters are alphabetic
str.isdecimal()       Checks if all characters are decimal characters
str.isdigit()         Checks if all characters are digits
str.isidentifier()    Checks if the string is a valid identifier
str.islower()         Checks if all cased characters are lowercase
str.isnumeric()       Checks if all characters are numeric
str.isprintable()     Checks if all characters are printable
str.isspace()         Checks if all characters are whitespace
str.istitle()         Checks if the string is titlecased
str.isupper()         Checks if all cased characters are uppercase
str.join()            Concatenates an iterable of strings, using the string as the separator
str.ljust()           Returns a left-justified string of the given width
str.rjust()           Returns a right-justified string of the given width
str.lower()           Returns a lowercased string
str.upper()           Returns an uppercased string
str.swapcase()        Swaps uppercase characters to lowercase and vice versa
str.lstrip()          Removes leading characters
str.rstrip()          Removes trailing characters
str.strip()           Removes both leading and trailing characters
str.partition()       Splits at the first occurrence of a separator and returns a 3-tuple
str.maketrans()       Returns a translation table for use with translate()
str.rpartition()      Splits at the last occurrence of a separator and returns a 3-tuple
str.translate()       Returns a copy with characters mapped through a translation table
str.replace()         Returns a copy with occurrences of a substring replaced
str.rfind()           Returns the highest index of a substring, or -1 if not found
str.rindex()          Returns the highest index of a substring; raises ValueError if not found
str.split()           Splits the string into a list, starting from the left
str.rsplit()          Splits the string into a list, starting from the right
str.splitlines()      Splits the string at line boundaries
str.startswith()      Checks if the string starts with the specified prefix
str.title()           Returns a titlecased string
str.zfill()           Returns a copy of the string padded with zeros on the left
str.format_map()      Formats the string using a dictionary
any()                 Checks if any element of an iterable is true
all()                 Returns True when all elements of an iterable are true
ascii()               Returns a printable representation with non-ASCII characters escaped
bool()                Converts a value to a Boolean
bytearray()           Returns a mutable array of bytes
bytes()               Returns an immutable bytes object
compile()             Returns a Python code object
complex()             Creates a complex number
enumerate()           Returns an enumerate object
filter()              Constructs an iterator from the elements for which a function returns true
float()               Returns a floating point number from a number or string
input()               Reads a line of input and returns it as a string
int()                 Returns an integer from a number or string
iter()                Returns an iterator for an object
len()                 Returns the length of an object
max()                 Returns the largest element
min()                 Returns the smallest element
map()                 Applies a function to every item and returns an iterator
ord()                 Returns the Unicode code point of a character
reversed()            Returns a reversed iterator over a given sequence
slice()               Creates a slice object specified by start, stop and step
sorted()              Returns a sorted list from a given iterable
sum()                 Adds the items of an iterable
zip()                 Returns an iterator of tuples
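Since this article is about substrings, here is a short sketch of the methods from the table that deal with substrings most directly. The sample text is an assumption made for the example.

```python
text = "Python substring examples"  # assumed sample text

# Membership test: is the substring present at all?
has_sub = "sub" in text

# find() returns the first index, or -1 when there is no match.
found_at = text.find("sub")
missing = text.find("xyz")

# index() behaves like find() but raises ValueError on a miss.
try:
    text.index("xyz")
    index_raised = False
except ValueError:
    index_raised = True

# count() counts non-overlapping occurrences.
s_count = text.count("s")

# replace() returns a new string; the original is unchanged.
replaced = text.replace("examples", "demos")

# partition() splits around the first occurrence of a separator.
head, sep, tail = text.partition(" ")

print(has_sub, found_at, missing, s_count)
print(replaced)
print(head, "|", tail)
```

When you only need a yes/no answer, the in operator is usually the clearest choice; find() and index() are for when you need the position.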

Slicing in Python

Now that we have looked at various string methods in Python, let us look at slicing in particular.
The slice() constructor generates a slice object which represents the set of indices specified by range(start, stop, step).

The slice object helps in slicing a given sequence (string, bytes, tuple, list or range) or any object which supports the sequence protocol (implements the __getitem__() and __len__() methods).

The syntax of slice() is:
slice(stop)
slice(start, stop, step)


slice() Parameters

slice() takes up to three parameters, which have the same meaning in both forms:
start - the integer index at which the slicing of the object starts
stop - the integer index before which the slicing ends. With a positive step, the last index that can be included is stop - 1.
step - the integer increment between successive indices
If only a single parameter is passed, it is taken as stop, and start and step are set to None.

Return value from slice()

slice() returns a slice object which is used to slice a sequence in the given indices.

Let us now consider an example of creating a slice object for slicing in Python.

# contains indices (0, 1, 2)
print(slice(3))

# contains indices (1, 3)
print(slice(1, 5, 2))

The output of the above program when you run it will be

slice(None, 3, None)
slice(1, 5, 2)
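Printing a slice object only shows its parameters. To actually use one, pass it inside square brackets. The sketch below applies the two slice objects from above to a sample string (the string itself is an assumption made for the example) and uses the indices() method of the slice object to show which positions are selected.

```python
s = "Hello, everybody!"  # assumed sample string

first_three = slice(3)        # same as [:3]
every_other = slice(1, 5, 2)  # same as [1:5:2]

# A slice object inside [] behaves like the colon notation.
part_a = s[first_three]   # 'Hel'
part_b = s[every_other]   # 'el' (the characters at indices 1 and 3)

# indices(length) resolves the None values for a given length
# and returns a (start, stop, step) tuple.
resolved = first_three.indices(len(s))
selected = list(range(*every_other.indices(len(s))))

print(part_a, part_b)
print(resolved, selected)
```

A named slice object is handy when the same range has to be cut out of many strings, for example a fixed-width field in every line of a file.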


Getting a Sub-string From a String in Python – Slicing Strings


Suppose you have the Python string 'paqlikview' and you would like to get the substring 'qlikview'. So, how does one get a sub-string in Python? Well, Python has a very useful feature called "slicing" that can be used for getting sub-strings from strings. But, before that, we have to go over a couple of things to understand how this works.


Slicing Python Objects

In Python, strings are sequences of characters, and although they behave a bit differently from arrays (Python lists), for the most part they can be treated the same way. With this in mind, we can use Python's array functionality, known as "slicing", on our strings! Slicing can be used on any sequence-type object in Python.

Okay, so here's a good example using a simple array to start off.

>>> a = [1,2,3,4,5]
>>> # This is slicing!
>>> a[:3]
[1, 2, 3]

As you can observe, this gives us a subset of the array containing its first 3 elements. Slicing takes two "arguments" which specify the start and end positions you would like in your array; the element at the start position is included and the element at the end position is not.

The Syntax is : array[start:end]

Going back to our example above, if we only wanted the elements 2 and 3, we would do the following:
>>> a[1:3]
[2, 3]
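The start position is included and the end position is not, which has a few handy consequences. The sketch below spells them out for the same list; the variable names are assumptions made for the example.

```python
a = [1, 2, 3, 4, 5]

start, end = 1, 3
middle = a[start:end]   # [2, 3]: indices 1 and 2, but not 3

# The length of a slice is simply end - start.
length_matches = len(middle) == end - start

# Splitting at the same position gives two pieces that add up
# to the original, with no overlap and no gap.
rejoined = a[:2] + a[2:]

# An end position past the last element is not an error.
tail = a[3:100]         # [4, 5]

print(middle, rejoined, tail)
```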

Alright! You might be wondering what this has to do with Python substrings.


Getting Sub-strings: Slicing Python Strings!


Previously, we mentioned that we can pretty much treat strings like arrays. That means, the same logic can be applied to our strings!

Here's an example:
>>> s = 'Hello, everybody!'
>>> s[0]
'H'
>>> s[:3]
'Hel'
>>> s[2:5]
'llo'

Wow! We have now accessed the characters of the string just as if they were elements in an array! Awesome!

What we have just observed here is a "sub-string". Getting a sub-string from a string is as simple as giving the desired start position and the desired end position (the character at the end position itself is not included). Of course, we can also omit either the start position or the end position, which tells Python that we would like to start our sub-string from the start, or end it at the end, respectively.

Here's another example using our string above.

>>> # Start from the beginning and go to character 5
>>> s[:5]
'Hello'
>>> # Start from character 5 and go to the end
>>> s[5:]
', everybody!'
>>> s[:]
'Hello, everybody!'

Alright, that's all well and good, but what did we do in that last line? We didn't specify the start or the end, so how come it still worked?

Well, what we did there was tell Python to go from the start element all the way to the end element. It's a completely valid statement. For an array (a Python list), this also creates a new copy, which you could use in case you need a copy of the original to modify! For a string there is little reason to copy, since strings are immutable and cannot be modified anyway.
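To see the copying behaviour in action, here is a minimal sketch using a list, where a copy really matters because lists can be modified. The names are assumptions made for the example.

```python
original = [1, 2, 3, 4, 5]

# [:] builds a new list with the same elements (a shallow copy).
copy = original[:]
copy[0] = 99

# The original list is untouched by changes to the copy.
print(original)   # [1, 2, 3, 4, 5]
print(copy)       # [99, 2, 3, 4, 5]

is_separate_object = copy is not original

# For a string, [:] just gives back an equal string.
s = "Hello, everybody!"
same_string = s[:] == s
```

Keep in mind that the copy is shallow: if the list contains other lists, the inner lists are shared between the original and the copy.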


Reverse Sub-string Slicing in Python


As another example, using extended slicing, we can also get a sub-string in reverse order. This is just a "for fun" example, but if you ever need to reverse a string in Python, or get the reversed sub-string of a string, this could definitely help.

>>> s[::-1]
'!ydobyreve ,olleH'
>>> s[4::-1]
'olleH'


Conclusion:


Since it's out of the scope of this article to elaborate on extended slicing, we won't say too much about it. The only difference is the extra colon at the end and the number following it. The extra colon tells Python that this is an extended slice, and the "-1" is the step to use when the string is traversed, so the characters are visited backwards. If we put a "1" where the "-1" is, we would get the same result as the plain slice without a step.
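For the curious, here is a short sketch of what different step values do, using the same string as in the examples above. The variable names are assumptions made for the example.

```python
s = "Hello, everybody!"

# A step of 1 is the default, so this is the same as s[:].
forward = s[::1]

# A step of 2 takes every second character.
every_second = s[::2]        # 'Hlo vrbd!'

# A step of -1 walks backwards through the whole string.
backward = s[::-1]           # '!ydobyreve ,olleH'

# Reversed sub-string, two ways: slice first and then reverse,
# or give start and stop directly with a negative step.
word = s[7:16]               # 'everybody'
reversed_word = word[::-1]   # 'ydobyreve'
reversed_direct = s[15:6:-1] # 'ydobyreve'

print(every_second)
print(backward)
print(reversed_word)
```

With a negative step, the start has to be the higher index and the stop the lower one, and the stop is still excluded. Slicing first and reversing afterwards is often easier to read.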

There you go! It's really easy to get sub-strings in Python, and we hope you are confident getting them!


Anjaneyulu Naini
About The Author

Anjaneyulu Naini is working as a Content contributor for Mindmajix. He has a great understanding of today’s technology and statistical analysis environment, which includes key aspects such as analysis of variance and software. He is well aware of various technologies such as Python, SAS, Artificial Intelligence, Oracle, Business Intelligence, Altrex, etc.

