How to cut out part of a string in Python

Python 3: Strings. String Functions and Methods

Finding the position of a substring in a string with str.find and str.rfind.

str.find returns the index of the first occurrence of the substring. All positions are counted from the start of the string.

The search can be limited to a slice of the string: the first number is the start of the slice to search and the second number is its end. If the substring does not occur, -1 is returned.

str.rfind searches from the end of the string, but still returns the position of the substring relative to the start of the string.
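The behaviour described above can be sketched as follows. The sample string is an assumption chosen for illustration; only str.find and str.rfind themselves come from the text.

```python
text = "spam and eggs and ham"  # hypothetical sample string

first = text.find("and")            # first occurrence, counted from the start
last = text.rfind("and")            # last occurrence, still counted from the start
in_slice = text.find("and", 6, 17)  # search only inside text[6:17]
missing = text.find("bacon")        # -1 when the substring is absent

print(first, last, in_slice, missing)  # 5 14 14 -1
```

Note that the slice bounds limit where the search happens, but the returned index still refers to the whole string.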

Python: Extracting the File Name from a URL

I needed to cut off everything that comes after the last slash of a URL, i.e. the file name. The URL can be anything. I know the task is easy to solve with a dedicated module, but I wanted to avoid that. There are at least two ways to handle it.

Method 1

A fairly simple approach. Split the string on slashes with split(), which returns a list, and then take the last element of that list. That element is the file name.

The same step can be repeated with the result assigned to a variable.
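A minimal sketch of this first method, assuming a made-up URL (the original example did not survive on this page):

```python
url = "http://example.com/downloads/archive.tar.gz"  # hypothetical URL

parts = url.split("/")  # ['http:', '', 'example.com', 'downloads', 'archive.tar.gz']
file_name = parts[-1]   # the last element is the file name

print(file_name)  # archive.tar.gz
```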

Method 2

The second approach is more interesting. First, rfind() finds the last occurrence of the substring, that is, the first one when searching from the end. It returns the position of the substring relative to the start of the string. After that, we simply take a slice.
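The second method might look like the sketch below. The URL is again an assumption for illustration.

```python
url = "http://example.com/downloads/archive.tar.gz"  # hypothetical URL

slash = url.rfind("/")       # index of the last slash
file_name = url[slash + 1:]  # everything after the last slash
base = url[:slash + 1]       # everything up to and including it

print(file_name)  # archive.tar.gz
print(base)       # http://example.com/downloads/
```

If the string contains no slash at all, rfind() returns -1, so `slash + 1` is 0 and `file_name` is simply the whole string.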

Remove Substring From String in Python

While handling text data in Python, we sometimes need to remove a specific substring from the text. In this article, we will discuss different ways to remove a substring from a string in Python.

Remove Substring From String in Python Using split() Method

The split() method in Python is used to split a string into substrings at a separator. The split() method, when invoked on a string, takes a string in the form of a separator as its input argument. After execution, it returns a list of substrings from the original string, which is split at the separator.

To remove a substring from a string in Python using the split() method, we will use the following steps.

  • First, we will create an empty string named output_string to store the output string.
  • Then, we will use the split() method to split the string into substrings from the positions where we need to remove a specific substring. For this, we will invoke the split() method on the input string with the substring that needs to be removed as the input argument. After execution, the split() method will return a list of substrings. We will assign the list to a variable str_list.
  • Once we get the list of strings, we will iterate through the substrings in str_list using a for loop. During iteration, we will add the current substring to output_string using the string concatenation operation.

After execution of the for loop, we will get the required output string in the variable output_string . You can observe this in the following code.
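A minimal sketch of these steps, assuming a made-up input sentence; the names output_string and str_list follow the description above.

```python
input_string = "I like python and python likes me"  # hypothetical sample text
substring = "python"

output_string = ""
str_list = input_string.split(substring)  # pieces between the occurrences
for piece in str_list:
    output_string += piece                # glue the pieces back together

print(repr(output_string))  # 'I like  and  likes me'
```

The spaces that surrounded each removed word stay in place, which is why double spaces appear in the result.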

In the output, you can observe that the substring python has been removed from the input string.

Remove Substring From String in Python Using the join() Method

Performing string concatenation several times requires unnecessary storage and time. Therefore, we can avoid that by using the join() method.

The join() method, when invoked on a separator string, takes an iterable object as its input argument. After execution, it returns a string consisting of the elements of the iterable object separated by the separator string.

To remove a substring from a string in Python using the join() method, we will use the following steps.

  • First, we will use the split() method to split the input string into substrings from the positions where we need to remove a specific substring. For this, we will invoke the split() method on the input string with the substring that needs to be removed as the input argument. After execution, the split() method will return a list of substrings. We will assign the list to a variable str_list.
  • Next, we will invoke the join() method on an empty string with str_list as its input argument.

After execution of the join() method, we will get the required string output as shown below.
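The two steps can be sketched like this, with the same assumed sample sentence as an illustration:

```python
input_string = "I like python and python likes me"  # hypothetical sample text
substring = "python"

str_list = input_string.split(substring)  # split at every occurrence
output_string = "".join(str_list)         # join the pieces with nothing in between

print(repr(output_string))  # 'I like  and  likes me'
```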

Here, you can observe that we have converted the list returned by the split() method into a string using the join() method. Thus, we have avoided repeated string concatenation as we did in the previous example.

Remove Substring From String in Python Using the replace() Method

The replace() method replaces occurrences of a substring in a string. When invoked on a string, it takes two substrings as input arguments and returns a new string in which every occurrence of the first argument is replaced with the second. The original string is left unchanged.

To remove a substring from a string using the replace() method, we will invoke the replace() method on the original string with the substring that is to be removed as the first input argument and an empty string as the second input argument.

After execution of the replace() method, we will get the output string as shown in the following example.
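A short sketch, assuming the same made-up sentence. The optional third argument of str.replace limits how many occurrences are replaced; it is shown here only as an aside.

```python
input_string = "I like python and python likes me"  # hypothetical sample text
substring = "python"

output_string = input_string.replace(substring, "")  # remove every occurrence
first_only = input_string.replace(substring, "", 1)  # remove only the first one

print(repr(output_string))  # 'I like  and  likes me'
print(repr(first_only))     # 'I like  and python likes me'
```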

Here, we have removed the required substring from the input string in a single statement using the replace() method.

Remove Substring From String in Python Using Regular Expressions

Regular expressions provide us with efficient ways to manipulate strings in Python. We can also use regular expressions to remove a substring from a string in Python. For this, we can use the re.split() method and the re.sub() method.

Remove Substring From String in Python Using re.split() Method

The re.split() method splits a text wherever a regular expression pattern matches. It takes the pattern as its first input argument and the text string as its second input argument. After execution, it returns a list of the pieces of the original string that were separated by the matches. Because the first argument is a pattern, a literal substring that contains characters such as . or * should be passed through re.escape() first.

To remove a substring from a string in Python using the re.split() method, we will use the following steps.

  • First, we will create an empty string named output_string to store the output string.
  • Then, we will use the re.split() method to split the string into substrings from the positions where we need to remove a specific substring. For this, we will execute the re.split() method with the substring that needs to be removed as its first input argument and the text string as its second input argument. After execution, the re.split() method will return a list of substrings. We will assign the list to a variable str_list.
  • Once we get the list of strings, we will iterate through the substrings in str_list using a for loop. During iteration, we will add the current substring to output_string using the string concatenation operation.

After execution of the for loop, we will get the required output string in the variable output_string . You can observe this in the following code.
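A minimal sketch of these steps with an assumed sample sentence. The re.escape() call is an addition not mentioned in the steps; it is there so that the substring is matched literally even if it contains regex metacharacters.

```python
import re

input_string = "I like python and python likes me"  # hypothetical sample text
substring = "python"

output_string = ""
str_list = re.split(re.escape(substring), input_string)  # split at every match
for piece in str_list:
    output_string += piece                               # concatenate the pieces

print(repr(output_string))  # 'I like  and  likes me'
```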

You can observe that the approach using the re.split() method is almost identical to the approach using the string split() method. The two differ in what they accept and in speed: re.split() treats its first argument as a regular expression, so it is the right tool when the separator is a pattern, while the string split() method is usually the faster choice for a fixed substring.

Performing string concatenation several times requires unnecessary memory and time. Therefore, we can avoid that by using the join() method.

To remove a substring from a string in Python using the join() method, we will use the following steps.

  • First, we will use the re.split() method to split the input string into substrings from the positions where we need to remove a specific substring. For this, we will execute the re.split() method with the substring that has to be removed as its first input argument and the text string as its second input argument. After execution, the re.split() method will return a list of substrings. We will assign the list to a variable str_list.
  • Next, we will invoke the join() method on an empty string with str_list as its input argument.

After execution of the join() method, we will get the required string output as shown below.

In this approach, we have obtained the output string in only two Python statements. Also, we have avoided the repeated string concatenation, which takes unnecessary time.

Remove Substring From String in Python Using re.sub() Method

The re.sub() method substitutes every match of a pattern in a string. It takes three input arguments. The first input argument is the regular expression pattern that needs to be substituted. The second input argument is the replacement string. The original string is passed as the third input argument.

After execution, the re.sub() method replaces the substring in the first argument with that of the second input argument. Then it returns the modified string.

To remove a substring from a string using the re.sub() method, we will execute the re.sub() method with the substring that is to be removed as the first input argument, an empty string as the second input argument, and the original string as the third input argument.

After execution of the re.sub() method, we will get the output string as shown in the following example.
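A short sketch with an assumed sample sentence. The second call is an extra illustration of what a pattern can do that a fixed substring cannot: it also removes the whitespace in front of each match.

```python
import re

input_string = "I like python and python likes me"  # hypothetical sample text
substring = "python"

# Literal removal; re.escape() guards against regex metacharacters.
output_string = re.sub(re.escape(substring), "", input_string)

# Pattern-based removal: the word together with any whitespace before it.
tidy_string = re.sub(r"\s*python", "", input_string)

print(repr(output_string))  # 'I like  and  likes me'
print(repr(tidy_string))    # 'I like and likes me'
```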

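The re.sub() method works in a similar manner to the replace() method. However, it interprets its first argument as a regular expression, which makes it the better choice when the text to remove follows a pattern. For a fixed substring, replace() is simpler and usually faster.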

Remove Substring From String in Python by Index

Sometimes, we might need to remove a substring from a string when we know its position in the string. To remove a substring from a string in Python by index, we will use string slicing.

If we have to remove the substring from index i to j (both inclusive), we will make two slices of the string. The first slice will be from index 0 to i-1, written s[:i], and the second slice will be from index j+1 to the last character, written s[j+1:].

After obtaining the slices, we will concatenate the slices to obtain the output string as shown in the following example.
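A minimal sketch of removal by index. The sample string and the indices are assumptions; here the word "python" occupies indices 7 through 12.

```python
input_string = "I like python a lot"  # hypothetical sample text
i, j = 7, 12                          # inclusive bounds of the part to remove

left = input_string[:i]        # characters 0 .. i-1
right = input_string[j + 1:]   # characters j+1 .. end
output_string = left + right

print(repr(output_string))  # 'I like  a lot'
```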

Conclusion

In this article, we have discussed different ways to remove a substring from a string in Python. Out of all the approaches, the replace() method and the re.sub() method do the job in a single call: replace() is the natural choice for a fixed substring, and re.sub() for text that follows a pattern. Therefore, I would suggest you use these approaches in your program.

Learn which built-in Python methods are used on strings

Андрей Шагин

A string is a sequence of characters. Python's built-in string class represents strings using the universal Unicode character set. Strings implement the common sequence operations in Python, along with some additional methods that are found nowhere else.

Let's look at the ones used most often. It is important to note that string methods always return new values; they never modify the original string or act on it in place.

The code for this article is available in the accompanying GitHub repository.

1. center()

The center() method centers a string within a given width. The padding is done with the specified character (a space by default).
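A small sketch of center(), using an assumed sample word and width:

```python
title = "menu"  # hypothetical sample string

centered = title.center(10)      # padded with spaces by default
starred = title.center(10, "*")  # padded with a custom fill character

print(repr(centered))  # '   menu   '
print(repr(starred))   # '***menu***'
print(repr(title))     # 'menu' (the original string is unchanged)
```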

How do I remove a substring from the end of a string?

strip doesn’t mean "remove this substring". x.strip(y) treats y as a set of characters and strips any characters in that set from both ends of x.

On Python 3.9 and newer you can use the removeprefix and removesuffix methods to remove an entire substring from either side of the string:
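A sketch contrasting the two behaviours, assuming the sample value abcdc.com (the domain used elsewhere in this thread). It requires Python 3.9 or newer.

```python
url = "abcdc.com"  # sample value, chosen for illustration

# strip() treats ".com" as the character set {'.', 'c', 'o', 'm'}
stripped = url.strip(".com")

# removesuffix()/removeprefix() remove the exact substring, once, if present
without_suffix = url.removesuffix(".com")
without_prefix = url.removeprefix("abcdc.")
unchanged = url.removesuffix(".net")

print(stripped, without_suffix, without_prefix, unchanged)  # abcd abcdc com abcdc.com
```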

The relevant Python Enhancement Proposal is PEP-616.

On Python 3.8 and older you can use endswith and slicing:
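A minimal sketch of the endswith-and-slicing approach, with an assumed sample value. The slice end is computed from both lengths so that an empty suffix would leave the string intact.

```python
url = "abcdc.com"  # sample value, chosen for illustration
suffix = ".com"

if url.endswith(suffix):
    url = url[:len(url) - len(suffix)]  # keep everything before the suffix

print(url)  # abcdc
```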

If you are sure that the string only appears at the end, then the simplest way would be to use ‘replace’:

Since it seems like nobody has pointed this out yet:

This should be more efficient than the methods using split() as no new list object is created, and this solution works for strings with several dots.

Starting in Python 3.9, you can use removesuffix instead.

Depends on what you know about your url and exactly what you’re trying to do. If you know that it will always end in ‘.com’ (or ‘.net’ or ‘.org’), then slicing off the last four characters is the quickest solution. If it’s a more general URL, then you’re probably better off looking into the urllib.parse module that comes with Python (urlparse in Python 2).

If, on the other hand, you simply want to remove everything after the final ‘.’ in a string, then slicing up to the index returned by rfind will work. Or if you just want everything up to the first ‘.’, then slice up to the index returned by find instead.
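The three options can be sketched as follows. The sample value is an assumption, and the exact snippets of the original answer did not survive on this page, so treat these as one plausible reading.

```python
url = "www.abcdc.com"  # hypothetical sample value

fixed_length = url[:-4]                 # drop a known 4-character ending such as '.com'
up_to_last_dot = url[:url.rfind(".")]   # everything before the final '.'
up_to_first_dot = url[:url.find(".")]   # everything before the first '.'

print(fixed_length, up_to_last_dot, up_to_first_dot)  # www.abcdc www.abcdc www
```

One caveat: if the string contains no dot, find() and rfind() return -1 and the slice silently drops the last character, so a check for -1 is worth adding in real code.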

If you know it’s an extension, then

This works equally well with abcdc.com or www.abcdc.com or abcdc.[anything] and is more extensible.

On any Python version:

or the one-liner:
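Both forms might look like the sketch below. The function names are hypothetical, and the guard against an empty suffix is an addition: without it, `text[:-0]` would return an empty string.

```python
def remove_suffix(text, suffix):
    """Return text without suffix at its end; return text unchanged otherwise."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


def remove_suffix_short(text, suffix):
    # The same logic written as a single conditional expression.
    return text[:-len(suffix)] if suffix and text.endswith(suffix) else text


print(remove_suffix("abcdc.com", ".com"))        # abcdc
print(remove_suffix_short("abcdc.org", ".com"))  # abcdc.org
```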

For urls (as it seems to be a part of the topic by the given example), one can do something like this:

Both will output: (‘http://www.stackoverflow’, ‘.com’)
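Two ways that produce this tuple are sketched below. Whether they are the exact ones the answer had in mind is an assumption, since its code is missing here; os.path.splitext and str.split are standard.

```python
import os.path

url = "http://www.stackoverflow.com"

# Option 1: let os.path.splitext split off the part after the last dot.
name, ext = os.path.splitext(url)

# Option 2: take the last dot-separated piece and slice it off by length.
alt_ext = "." + url.split(".")[-1]
alt_name = url[:-len(alt_ext)]

print((name, ext))          # ('http://www.stackoverflow', '.com')
print((alt_name, alt_ext))  # ('http://www.stackoverflow', '.com')
```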

This can also be combined with str.endswith(suffix) if you need to just split «.com», or anything specific.

DISCLAIMER: This method has a critical flaw in that the partition is not anchored to the end of the url and may return spurious results. For example, the result for the URL "www.comcast.net" is "www" (incorrect) instead of the expected "www.comcast.net". This solution therefore is evil. Don’t use it unless you know what you are doing!

This is fairly easy to type and also correctly returns the original string (no error) when the suffix ‘.com’ is missing from url .
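The behaviour and the flaw described in the disclaimer can be reproduced with str.partition, assuming that is the call the answer used:

```python
with_suffix = "abcdc.com".partition(".com")[0]     # suffix present at the end
no_suffix = "abcdc.org".partition(".com")[0]       # suffix missing: string unchanged
spurious = "www.comcast.net".partition(".com")[0]  # '.com' matched in the middle

print(with_suffix, no_suffix, spurious)  # abcdc abcdc.org www
```

partition() splits at the first occurrence of the separator wherever it is, which is why the third result is wrong.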

Assuming you want to remove the domain, no matter what it is (.com, .net, etc). I recommend finding the . and removing everything from that point on.

Here I’m using rfind to solve the problem of urls like abcdc.com.net which should be reduced to the name abcdc.com .

If you’re also concerned about www. prefixes, you should explicitly check for them:

The 1 in replace is for strange edge cases like www.net.www.com
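Put together, the rfind step and the www. check might look like this sketch. The helper name strip_domain is hypothetical, and the -1 check is an added safeguard for strings without a dot.

```python
def strip_domain(url):
    """Hypothetical helper: drop the last dotted part and a leading 'www.'."""
    dot_index = url.rfind(".")
    if dot_index != -1:                   # leave dot-free strings untouched
        url = url[:dot_index]
    if url.startswith("www."):
        url = url.replace("www.", "", 1)  # count=1: only the leading one
    return url


print(strip_domain("abcdc.com.net"))    # abcdc.com
print(strip_domain("www.abcdc.com"))    # abcdc
print(strip_domain("www.net.www.com"))  # net.www
```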

If your url gets any wilder than that look at the regex answers people have responded with.

If you mean to only strip the extension:

It works with any extension, with potential other dots existing in filename as well. It simply splits the string as a list on dots and joins it without the last element.
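The split-and-join idea described above can be sketched like this, with an assumed file name:

```python
filename = "archive.tar.gz"  # hypothetical file name

pieces = filename.split(".")   # ['archive', 'tar', 'gz']
stem = ".".join(pieces[:-1])   # join everything except the last piece

print(stem)  # archive.tar
```

A name without any dot would come back as an empty string with this approach, so it suits inputs that are known to have an extension.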

If you need to strip some ending from a string when it is present and otherwise do nothing, these are my best solutions. You will probably want to use one of the first two implementations; I have included the third for completeness.

For a constant suffix:

For a collection of constant suffixes the asymptotically fastest way for a large number of calls:

The final one is probably significantly faster in PyPy than in CPython. The regex variant is likely faster than this for virtually all cases that do not involve huge dictionaries of potential suffixes that cannot be easily represented as a regex, at least in CPython.

In PyPy the regex variant is almost certainly slower for a large number of calls or long strings, even if the re module uses a DFA-compiling regex engine, as the vast majority of the overhead of the lambdas will be optimized out by the JIT.

In CPython, however, the fact that you’re running C code for the regex comparison almost certainly outweighs the algorithmic advantages of the suffix collection version in almost all cases.