
String Functions: Nim vs Python

While learning the Nim language and trying to correlate that with my Python 3 knowledge, I came across this awesome comparison table of string manipulation functions between the two languages.

My utmost gratitude goes to the developers of Nim, Python, Org, ob-nim and ob-python, and of course Hugo which allowed me to publish my notes in this presentable format.

Here are the code samples and their outputs. In each mini-section below, you will find a Python code snippet, followed by its output, and then the same implementation in Nim, followed by the output of that.

The tool versions used are:

  • Python 3.6.2
  • Nim 0.18.1 [Linux: amd64] git hash: 50039422 (devel branch)

Updates #

<2017-12-13 Wed>
Update the Nim snippets output using the improved echo! The echo output difference is notable in the .split examples. This fixes the issue about confusing echo outputs that I raised in Nim Issue #6225. Big thanks to Fabian Keller for Nim PR #6825!
<2017-11-29 Wed>
Update the Understanding the ^N syntax example that gave incorrect output before Nim Issue #6223 got fixed.
<2017-11-28 Tue>
Update the .join example that did not work before Nim Issue #6210 got fixed.
<2017-11-27 Mon>
Use the binary operator ..< instead of the combination of binary operator .. and the deprecated unary < operator.

String slicing #

All characters except last #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str[:-1])
a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# Always add a space around the .. and ..< operators
echo str[0 ..< str.high]
# or
echo str[0 .. ^2]
# or
echo str[ .. ^2]
a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy

Understanding the ^N syntax #

var str = "abc"
# Always add a space around the .. and ..< operators
echo "1st char(0) to last, including \\0(^0) : ", str[0 .. ^0] # Interestingly, this also prints the NULL character in the output.. looks like "abc^@" in Emacs
echo "1st char(0) to last       (^1) «3rd»  : ", str[0 .. ^1]
echo "1st char(0) to 2nd-to-last(^2) «2nd»  : ", str[0 .. ^2]
echo "1st char(0) to 3rd-to-last(^3) «1st»  : ", str[0 .. ^3]
echo "1st char(0) to 4th-to-last(^4) «0th»  : ", str[0 .. ^4]
# echo "1st char(0) to 4th-to-last(^4) «0th»  : ", str[0 .. ^5] # Error: unhandled exception: value out of range: -1 [RangeError]
# echo "2nd char(1) to 4th-to-last(^4) «0th»  : ", str[1 .. ^4] # Error: unhandled exception: value out of range: -1 [RangeError]
echo "2nd char(1) to 3rd-to-last(^3) «1st»  : ", str[1 .. ^3]
echo "2nd char(1) to 2nd-to-last(^2) «2nd»  : ", str[1 .. ^2]
echo "2nd char(1) to last,      (^1) «3rd»  : ", str[1 .. ^1]
echo "Now going a bit crazy .."
echo " 2nd-to-last(^2) «2nd» char to 3rd(2)         : ", str[^2 .. 2]
echo " 2nd-to-last(^2) «2nd» char to last(^1) «3rd» : ", str[^2 .. ^1]
echo " 3rd-to-last(^3) «1st» char to 3rd(2)         : ", str[^3 .. 2]
1st char(0) to last, including \0(^0) : abc
1st char(0) to last       (^1) «3rd»  : abc
1st char(0) to 2nd-to-last(^2) «2nd»  : ab
1st char(0) to 3rd-to-last(^3) «1st»  : a
1st char(0) to 4th-to-last(^4) «0th»  :
2nd char(1) to 3rd-to-last(^3) «1st»  :
2nd char(1) to 2nd-to-last(^2) «2nd»  : b
2nd char(1) to last,      (^1) «3rd»  : bc
Now going a bit crazy ..
 2nd-to-last(^2) «2nd» char to 3rd(2)         : bc
 2nd-to-last(^2) «2nd» char to last(^1) «3rd» : bc
 3rd-to-last(^3) «1st» char to 3rd(2)         : abc

Notes #

  • It is recommended to always use a space around the .. and ..< binary operators to get consistent results (and no compilation errors!). Examples: [0 ..< str.high], [0 .. str.high], [0 .. ^2], [ .. ^2]. This is based on the tip by Andreas Rumpf (also one of the core devs of Nim). You will find the full discussion around this topic of dots and spaces in Nim Issue #6216.

    Special ascii chars like % . & $ are collected into a single operator token. – Araq

  • To repeat: Always add a space around the .. and ..< operators.

  • As of commit 70ea45cdba, the < unary operator is deprecated! So do [0 ..< str.high] instead of [0 .. <str.high] (see Nim Issue #6788).

  • With str set to "abc", both str[0 .. ^5] and str[1 .. ^4] used to return an empty string incorrectly (see Nim Issue #6223). That was fixed in commit b74a5148a9. After the fix, those cause this error:

    system.nim(3534)         []
    system.nim(2819)         sysFatal
    Error: unhandled exception: value out of range: -1 [RangeError]

    Also after this fix, str[0 .. ^0] outputs abc^@ (where ^@ is the representation of NULL character).. very cool!
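
For readers coming from Python, the ^N notation can be related to Python's exclusive upper bound. The sketch below uses a hypothetical helper, nim_slice (not part of either language), that mimics Nim's inclusive str[first .. ^back] on a Python string. It assumes ^back refers to index len(s) - back, and it raises an error in the same cases where the Nim examples above fail.

```python
def nim_slice(s, first, back):
    """Mimic Nim's inclusive s[first .. ^back] on a Python string.

    Hypothetical helper for illustration: ^back is taken to mean the
    index len(s) - back, and the upper bound is inclusive as in Nim.
    """
    last = len(s) - back          # index that ^back refers to
    if last < first - 1:          # the Nim examples above raise RangeError here
        raise IndexError("slice end is before slice start")
    return s[first:last + 1]      # Python's upper bound is exclusive


s = "abc"
print(nim_slice(s, 0, 1))  # like str[0 .. ^1]
print(nim_slice(s, 0, 2))  # like str[0 .. ^2], same as s[:-1] in Python
print(nim_slice(s, 1, 3))  # like str[1 .. ^3], an empty string
```

In short, str[a .. ^n] in Nim corresponds to s[a:len(s) - n + 1] in Python, and str[a ..< b] corresponds to s[a:b].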

All characters except first #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str[1:])
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# echo str[1 .. ] # Does not work.. Error: expression expected, but found ']'
# https://github.com/nim-lang/Nim/issues/6212
# Always add a space around the .. and ..< operators
echo str[1 .. str.high]
# or
echo str[1 .. ^1] # second(1) to last(^1)
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx

All characters except first and last #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str[1:-1])
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# Always add a space around the .. and ..< operators
echo str[1 ..< str.high]
# or
echo str[1 .. ^2] # second(1) to second-to-last(^2)
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy

Count #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.count('a'))
print(str.count('de'))
4
2
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.count('a')
echo str.count("de")
4
2

Starts/ends with #

Starts With #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.startswith('a'))
print(str.startswith('a\t'))
print(str.startswith('z'))
True
True
False
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.startsWith('a') # Recommended Nim style
# or
echo str.startswith('a')
# or
echo str.starts_with('a')
echo str.startsWith("a\t")
echo str.startsWith('z')
true
true
true
true
false

Notes #

  • All Nim identifiers are case and underscore insensitive (except for the first character of the identifier), as seen in the above example. So any of startsWith or startswith or starts_with would work the exact same way.
  • Though, it has to be noted that using the camelCase variant (startsWith) is preferred in Nim.
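
The identifier rule above can be sketched in Python as a normalization step. The helper names nim_ident_key and same_nim_ident are made up for this illustration, and the sketch assumes non-empty ASCII identifiers: the first character is compared exactly, while the rest is compared with underscores removed and case ignored.

```python
def nim_ident_key(name):
    """Normalize an identifier the way Nim compares identifiers (sketch).

    The first character is kept as-is; in the rest, underscores are
    dropped and case is ignored. Assumes a non-empty identifier.
    """
    return name[0] + name[1:].replace("_", "").lower()


def same_nim_ident(a, b):
    """Return True if Nim would treat a and b as the same identifier."""
    return nim_ident_key(a) == nim_ident_key(b)


print(same_nim_ident("startsWith", "starts_with"))  # True
print(same_nim_ident("startsWith", "startswith"))   # True
print(same_nim_ident("startsWith", "StartsWith"))   # False: first char differs
```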

Ends With #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.endswith('x'))
print(str.endswith('yx'))
print(str.endswith('z'))
True
True
False
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.endsWith('x')
echo str.endsWith("yx")
echo str.endsWith('z')
true
true
false

Expand Tabs #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.expandtabs())
print(str.expandtabs(4))
a       bc      def     aghij   cklm    danopqrstuv     adefwxyz        zyx
a   bc  def aghij   cklm    danopqrstuv adefwxyz    zyx
import strmisc
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.expandTabs()
echo str.expandTabs(4)
a       bc      def     aghij   cklm    danopqrstuv     adefwxyz        zyx
a   bc  def aghij   cklm    danopqrstuv adefwxyz    zyx

Find/Index #

Find (from left) #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.find('a'))
print(str.find('b'))
print(str.find('c'))
print(str.find('zyx'))
print(str.find('aaa'))
0
2
3
41
-1
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.find('a')
echo str.find('b')
echo str.find('c')
echo str.find("zyx")
echo str.find("aaa")
0
2
3
41
-1

Find from right #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.rfind('a'))
print(str.rfind('b'))
print(str.rfind('c'))
print(str.rfind('zyx'))
print(str.rfind('aaa'))
32
2
15
41
-1
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.rfind('a')
echo str.rfind('b')
echo str.rfind('c')
echo str.rfind("zyx")
echo str.rfind("aaa")
32
2
15
41
-1

Index (from left) #

From Python 3 docs,

Like find(), but raise ValueError when the substring is not found.

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.index('a'))
print(str.index('b'))
print(str.index('c'))
print(str.index('zyx'))
# print(str.index('aaa')) # Throws ValueError: substring not found
0
2
3
41

Nim does not have an error-raising index function like that out of the box, but something similar can be done with:

import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# https://nim-lang.org/docs/strutils.html#find,string,string,Natural,Natural
# proc find(s, sub: string; start: Natural = 0; last: Natural = 0): int {..}
proc index(s, sub: auto; start: Natural = 0; last: Natural = 0): int =
  result = s.find(sub, start, last)
  if result<0:
    raise newException(ValueError, "$1 not found in $2".format(sub, s))

echo str.index('a')
echo str.index('b')
echo str.index('c')
echo str.index("zyx")
# echo str.index("aaa") # Error: unhandled exception: aaa not found in a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx [ValueError]
0
2
3
41

Notes #

  • No Nim equivalent, but I came up with my own index proc for Nim above.

Index from right #

From Python 3 docs,

Like rfind(), but raise ValueError when the substring is not found.

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.rindex('a'))
print(str.rindex('b'))
print(str.rindex('c'))
print(str.rindex('zyx'))
# print(str.rindex('aaa')) # Throws ValueError: substring not found
32
2
15
41

Nim does not have an error-raising rindex function like that out of the box, but something similar can be done with:

import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# https://nim-lang.org/docs/strutils.html#rfind,string,string,int
# proc rfind(s, sub: string; start: int = - 1): int {..}
proc rindex(s, sub: auto; start: int = -1): int =
  result = s.rfind(sub, start)
  if result<0:
    raise newException(ValueError, "$1 not found in $2".format(sub, s))

echo str.rindex('a')
echo str.rindex('b')
echo str.rindex('c')
echo str.rindex("zyx")
# echo str.rindex("aaa") # Error: unhandled exception: aaa not found in a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx [ValueError]
32
2
15
41

Notes #

  • No Nim equivalent, but I came up with my own rindex proc for Nim above.

Format #

print('{} {}'.format(1, 2))
print('{} {}'.format('a', 'b'))
1 2
a b
import strutils
# echo "$1 $2" % [1, 2] # This gives error. % cannot have int list as arg, has to be string list
# echo "$1 $2" % ['a', 'b'] # This gives error. % cannot have char list as arg, has to be string list
echo "$1 $2" % ["a", "b"]
# or
echo "$1 $2".format(["a", "b"])
# or
echo "$1 $2".format("a", "b")
# or
echo "$1 $2".format('a', 'b') # format, unlike % does auto-stringification of the input
echo "$1 $2".format(1, 2)
a b
a b
a b
a b
1 2

Using strfmt module #

import strfmt
echo "{} {}".fmt(1, 0)
echo "{} {}".fmt('a', 'b')
echo "{} {}".fmt("abc", "def")
echo "{0} {1} {0}".fmt(1, 0)
echo "{0.x} {0.y}".fmt((x: 1, y:"foo"))
1 0
a b
abc def
1 0 1
1 foo

Notes #

  • Thanks to the tip by Daniil Yarancev, I can use the strfmt module to get a formatting function like Python's .format().
  • You will need to install it first by doing nimble install strfmt.
  • See this for full documentation on strfmt. The awesome thing is that it allows using a format syntax that’s similar to Python’s Format Specification Mini-Language 🙌
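
For reference, here are a few Format Specification Mini-Language specs on the Python side. This is only a sketch of the Python behavior; whether strfmt accepts exactly the same specs is not verified here, so check its documentation before porting them.

```python
padded = "{:05d}".format(42)          # zero-pad an integer to width 5
centered = "{:*^9}".format("abc")     # center in width 9, fill with '*'
rounded = "{:>8.3f}".format(3.14159)  # right-align in width 8, 3 decimals
reused = "{0} {1} {0}".format(1, 0)   # positional reuse of an argument

for value in (padded, centered, rounded, reused):
    print(repr(value))
```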

String Predicates #

Is Alphanumeric? #

print('abc'.isalnum())
print('012'.isalnum())
print('abc012'.isalnum())
print('abc012_'.isalnum())
print('{}'.isalnum())
print('Unicode:')
print('ábc'.isalnum())
True
True
True
False
False
Unicode:
True
import strutils
echo "abc".isAlphaNumeric()
echo "012".isAlphaNumeric()
echo "abc012".isAlphaNumeric()
echo "abc012_".isAlphaNumeric()
echo "{}".isAlphaNumeric()
echo "[Wrong] ", "ábc".isAlphaNumeric() # Returns false! isAlphaNumeric works only for ascii.
true
true
true
false
false
[Wrong] false

TODO Figure out how to write unicode-equivalent of isAlphaNumeric #

Is Alpha? #

print('abc'.isalpha())
print('012'.isalpha())
print('abc012'.isalpha())
print('abc012_'.isalpha())
print('{}'.isalpha())
print('Unicode:')
print('ábc'.isalpha())
True
False
False
False
False
Unicode:
True
import strutils except isAlpha
import unicode
echo "abc".isAlphaAscii()
echo "012".isAlphaAscii()
echo "abc012".isAlphaAscii()
echo "abc012_".isAlphaAscii()
echo "{}".isAlphaAscii()
echo "Unicode:"
echo unicode.isAlpha("ábc")
# or
echo isAlpha("ábc") # unicode prefix is not needed
                    # because of import strutils except isAlpha
# or
echo "ábc".isAlpha() # from unicode
true
false
false
false
false
Unicode:
true
true
true

Notes #

  • Thanks to the tip from Dominik Picheta on the use of except in import:

    import strutils except isAlpha
    import unicode

    That prevents an ambiguous call error like the one below, because we import everything from strutils except the isAlpha proc. Thus the unicode version of the isAlpha proc is used automatically.

    nim_src_28505flZ.nim(14, 13) Error: ambiguous call; both strutils.isAlpha(s: string)[declared in lib/pure/strutils.nim(289, 5)] and unicode.isAlpha(s: string)[declared in lib/pure/unicode.nim(1416, 5)] match for: (string)

Is Digit? #

print('abc'.isdigit())
print('012'.isdigit())
print('abc012'.isdigit())
print('abc012_'.isdigit())
print('{}'.isdigit())
False
True
False
False
False
import strutils
echo "abc".isDigit()
echo "012".isDigit()
echo "abc012".isDigit()
echo "abc012_".isDigit()
echo "{}".isDigit()
false
true
false
false
false

Is Lower? #

print('a'.islower())
print('A'.islower())
print('abc'.islower())
print('Abc'.islower())
print('aBc'.islower())
print('012'.islower())
print('{}'.islower())
print('ABC'.islower())
print('À'.islower())
print('à'.islower())
True
False
True
False
False
False
False
False
False
True
import strutils except isLower
import unicode
echo 'a'.isLowerAscii()
echo 'A'.isLowerAscii()
echo "abc".isLowerAscii()
echo "Abc".isLowerAscii()
echo "aBc".isLowerAscii()
echo "012".isLowerAscii()
echo "{}".isLowerAscii()
echo "ABC".isLowerAscii()
echo "À".isLowerAscii()
echo "[Wrong] ", "à".isLowerAscii() # Returns false! As the name suggests, works only for ascii.
echo "À".isLower() # from unicode
echo "à".isLower() # from unicode
# echo "à".unicode.isLower() # Does not work, Error: undeclared field: 'unicode'
true
false
true
false
false
false
false
false
false
[Wrong] false
false
true

Notes #

  1. isLower from strutils is deprecated. Use isLowerAscii instead, or isLower from unicode (as done above).
  2. To check if a non-ASCII letter is in lower case, use unicode.isLower.

Is Upper? #

print('a'.isupper())
print('A'.isupper())
print('abc'.isupper())
print('Abc'.isupper())
print('aBc'.isupper())
print('012'.isupper())
print('{}'.isupper())
print('ABC'.isupper())
print('À'.isupper())
print('à'.isupper())
False
True
False
False
False
False
False
True
True
False
import strutils except isUpper
import unicode
echo 'a'.isUpperAscii()
echo 'A'.isUpperAscii()
echo "abc".isUpperAscii()
echo "Abc".isUpperAscii()
echo "aBc".isUpperAscii()
echo "012".isUpperAscii()
echo "{}".isUpperAscii()
echo "ABC".isUpperAscii()
echo "[Wrong] ", "À".isUpperAscii() # Returns false! As the name suggests, works only for ascii.
echo "à".isUpperAscii()
echo "À".isUpper() # from unicode
echo "à".isUpper() # from unicode
# echo "À".unicode.isUpper() # Does not work, Error: undeclared field: 'unicode'
false
true
false
false
false
false
false
true
[Wrong] false
false
true
false

Notes #

  1. isUpper from strutils is deprecated. Use isUpperAscii instead, or isUpper from unicode (as done above).
  2. To check if a non-ASCII letter is in upper case, use unicode.isUpper.

Is Space? #

print(''.isspace())
print(' '.isspace())
print('\t'.isspace())
print('\r'.isspace())
print('\n'.isspace())
print(' \t\n'.isspace())
print('abc'.isspace())
print('Testing with ZERO WIDTH SPACE unicode character below:')
print('[Wrong] {}'.format('​'.isspace())) # Returns false! That's, I believe, incorrect behavior.
False
True
True
True
True
True
False
Testing with ZERO WIDTH SPACE unicode character below:
[Wrong] False
import strutils except isSpace
import unicode
echo "".isSpaceAscii() # empty string has to be in double quotes
echo ' '.isSpaceAscii()
echo '\t'.isSpaceAscii()
echo '\r'.isSpaceAscii()
echo "\n".isSpaceAscii() # \n is a string, not a character in Nim
echo " \t\n".isSpaceAscii()
echo "abc".isSpaceAscii()
echo "Testing with ZERO WIDTH SPACE unicode character below:"
echo "[Wrong] ", "​".isSpaceAscii() # Returns false! As the name suggests, works only for ascii.
echo "​".isSpace() # from unicode
false
true
true
true
true
true
false
Testing with ZERO WIDTH SPACE unicode character below:
[Wrong] false
true

Notes #

  1. Empty string results in a false result for both Python and Nim variants of isspace.
  2. \n is a string, not a character, in Nim because, depending on the OS, \n can consist of one or more characters.
  3. isSpace from strutils is deprecated. Use isSpaceAscii instead, or isSpace from unicode (as done above).
  4. To check if a non-ASCII character is whitespace, use unicode.isSpace.
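  5. Interestingly, Nim’s isSpace from the unicode module returns true for the ZERO WIDTH SPACE character (U+200B), but Python’s isspace returns False. I expected True from Python as well, though Unicode classifies U+200B as a format character (category Cf) rather than as whitespace, so Python’s result is consistent with that classification.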
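
One way to look into the ZERO WIDTH SPACE result on the Python side is to print the Unicode general category of the character with the standard unicodedata module. The sketch below compares it with EM SPACE and a plain space; the variable names are only for illustration.

```python
import unicodedata

zwsp = "\u200b"      # ZERO WIDTH SPACE
em_space = "\u2003"  # EM SPACE, for comparison

for ch in (zwsp, em_space, " "):
    print(
        "U+{:04X}".format(ord(ch)),
        unicodedata.name(ch),
        unicodedata.category(ch),  # e.g. Zs = space separator, Cf = format
        ch.isspace(),
    )
```

Python's isspace follows the Unicode data for each character, so a character that is not classified as a separator or whitespace there yields False.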

Is Title? #

print(''.istitle())
print('this is not a title'.istitle())
print('This Is A Title'.istitle())
print('This Is À Title'.istitle())
print('This Is Not a Title'.istitle())
False
False
True
True
False
import unicode
echo "".isTitle()
echo "this is not a title".isTitle()
echo "This Is A Title".isTitle()
echo "This Is À Title".isTitle()
echo "This Is Not a Title".isTitle()
false
false
true
true
false

Join #

print(' '.join(['a', 'b', 'c']))
print('xx'.join(['a', 'b', 'c']))
a b c
axxbxxc
import strutils
echo "Sequences:"
# echo @["a", "b", "c"].join(' ') # Error: type mismatch: got (seq[string], char)
echo @["a", "b", "c"].join(" ")
echo join(@["a", "b", "c"], " ")
echo "Lists:"
echo ["a", "b", "c"].join(" ") # Works after Nim issue # 6210 got fixed.
echo (["a", "b", "c"].join(" ")) # Works!
echo join(["a", "b", "c"], " ") # Works!
var list = ["a", "b", "c"]
echo list.join(" ") # Works too!
echo @["a", "b", "c"].join("xx")
Sequences:
a b c
a b c
Lists:
a b c
a b c
a b c
a b c
axxbxxc

Notes #

  1. The second argument to join, the separator, has to be a string; it cannot be a character.
  2. echo ["a", "b", "c"].join(" ") did not work prior to the fix in commit ddc131cf07 (see Nim Issue #6210), but now it does!

Justify with filling #

Center Justify with filling #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.center(80))
print(str.center(80, '*'))
                  a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
******************a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx******************
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.center(80)
echo str.center(80, '*')
# or
echo center(str, 80, '*')
                  a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
******************a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx******************
******************a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx******************

Left Justify with filling #

print('abc'.ljust(2, '*'))
print('abc'.ljust(10, '*'))
abc
abc*******
import strutils
proc ljust(s: string; count: Natural; padding = ' '): string =
  result = s
  let strlen: int = len(s)
  if strlen < count:
    result.add(padding.repeat(count-strlen))

echo "abc".ljust(2, '*')
echo "abc".ljust(10, '*')
abc
abc*******

Notes #

  • No Nim equivalent, but I came up with my own ljust proc for Nim above.

Right Justify with filling #

print('abc'.rjust(10, '*'))
*******abc
import strutils
echo "abc".align(10, '*')
*******abc

Zero Fill #

print('42'.zfill(5))
print('-42'.zfill(5))
print(' -42'.zfill(5))
00042
-0042
0 -42
import strutils
echo "Using align:"
echo "42".align(5, '0')
echo "-42".align(5, '0')

echo "Using zfill:"
proc zfill(s: string; count: Natural): string =
  let strlen: int = len(s)
  if strlen < count:
    if s.len > 0 and s[0] == '-': # guard so that an empty string is never indexed
      result = "-"
      result.add("0".repeat(count-strlen))
      result.add(s[1 .. s.high])
    else:
      result = "0".repeat(count-strlen)
      result.add(s)
  else:
    result = s

echo "42".zfill(5)
echo "-42".zfill(5)
echo " -42".zfill(5)
Using align:
00042
00-42
Using zfill:
00042
-0042
0 -42

Notes #

  • Nim's align does not handle strings representing negative numbers the way Python's zfill does: it pads with zeroes to the left of the sign instead of after it.
  • No Nim equivalent, but I came up with my own zfill proc for Nim above.
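
The difference between blind left padding and sign-aware zero filling can also be seen from the Python side. The sketch below hand-rolls the zero-fill logic in a hypothetical zfill_sketch function (Python's own zfill already does this) and compares it with rjust, which pads blindly like Nim's align.

```python
def zfill_sketch(s, width):
    """Hand-rolled zero fill that keeps a leading sign in front (sketch)."""
    if len(s) >= width:
        return s
    pad = "0" * (width - len(s))
    if s[:1] in ("+", "-"):       # s[:1] is safe for an empty string
        return s[0] + pad + s[1:]
    return pad + s


for text in ("42", "-42", " -42"):
    # rjust pads blindly on the left, like Nim's align with '0'
    print(repr(text.rjust(5, "0")), repr(zfill_sketch(text, 5)))
```

Python's zfill also treats a leading + as a sign, which the Nim zfill proc above does not attempt.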

Case conversion #

To Lower #

print('a'.lower())
print('A'.lower())
print('abc'.lower())
print('Abc'.lower())
print('aBc'.lower())
print('012'.lower())
print('{}'.lower())
print('ABC'.lower())
print('À'.lower())
print('à'.lower())
a
a
abc
abc
abc
012
{}
abc
à
à
import strutils except toLower
import unicode
echo 'a'.toLowerAscii()
echo 'A'.toLowerAscii()
echo "abc".toLowerAscii()
echo "Abc".toLowerAscii()
echo "aBc".toLowerAscii()
echo "012".toLowerAscii()
echo "{}".toLowerAscii()
echo "ABC".toLowerAscii()
echo "[Wrong] ", "À".toLowerAscii() # Does not work! As the name suggests, works only for ascii.
echo "à".toLowerAscii()
echo "À".toLower() # from unicode
echo "à".toLower() # from unicode
a
a
abc
abc
abc
012
{}
abc
[Wrong] À
à
à
à

Notes #

  1. toLower from strutils is deprecated. Use toLowerAscii instead, or toLower from unicode (as done above).
  2. To convert a non-ASCII letter to lower case, use unicode.toLower.

To Upper #

print('a'.upper())
print('A'.upper())
print('abc'.upper())
print('Abc'.upper())
print('aBc'.upper())
print('012'.upper())
print('{}'.upper())
print('ABC'.upper())
print('À'.upper())
print('à'.upper())
A
A
ABC
ABC
ABC
012
{}
ABC
À
À
import strutils except toUpper
import unicode
echo 'a'.toUpperAscii()
echo 'A'.toUpperAscii()
echo "abc".toUpperAscii()
echo "Abc".toUpperAscii()
echo "aBc".toUpperAscii()
echo "012".toUpperAscii()
echo "{}".toUpperAscii()
echo "ABC".toUpperAscii()
echo "À".toUpperAscii()
echo "[Wrong] ", "à".toUpperAscii() # Does not work! As the name suggests, works only for ascii.
echo "À".toUpper() # from unicode
echo "à".toUpper() # from unicode
A
A
ABC
ABC
ABC
012
{}
ABC
À
[Wrong] à
À
À

Notes #

  1. toUpper from strutils is deprecated. Use toUpperAscii instead, or toUpper from unicode (as done above).
  2. To convert a non-ASCII letter to upper case, use unicode.toUpper.

Capitalize #

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.capitalize())
A	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.capitalizeAscii
# or
echo capitalizeAscii(str)
A	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
A	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx

To Title #

print('convert this to title á û'.title())
Convert This To Title Á Û
import unicode
echo "convert this to title á û".title()
Convert This To Title Á Û

Swap Case #

print('Swap CASE example á û Ê'.swapcase())
print('Swap CASE example á û Ê'.swapcase().swapcase())
sWAP case EXAMPLE Á Û ê
Swap CASE example á û Ê
import unicode
echo "Swap CASE example á û Ê".swapcase()
echo "Swap CASE example á û Ê".swapcase().swapcase()
sWAP case EXAMPLE Á Û ê
Swap CASE example á û Ê

Notes #

  • See this SO Q/A to read about a few cases where s.swapcase().swapcase()==s is not true (at least for Python).
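
One such case in Python involves the German sharp s, which has no single-character uppercase form. This is a small illustration on the Python side only; I have not checked how Nim's unicode swapcase treats this input.

```python
s = "straße"
once = s.swapcase()      # 'ß' expands to two characters when uppercased
twice = once.swapcase()  # the expansion is not undone on the way back

print(once)
print(twice)
print(twice == s)
```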

Strip #

Left/leading and right/trailing Strip #

print('«' + '   spacious   '.strip() + '»')
print('www.example.com'.strip('cmowz.'))
print('mississippi'.strip('mipz'))
«spacious»
example
ssiss
import strutils
echo "«", "   spacious   ".strip(), "»"
echo "www.example.com".strip(chars={'c', 'm', 'o', 'w', 'z', '.'})
echo "mississippi".strip(chars={'m', 'i', 'p', 'z'})
«spacious»
example
ssiss

Notes #

  • Python strip takes a string as an argument to specify the letters that need to be stripped off the input string. But Nim strip requires a Set of characters.
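
Although Python takes a string here, it treats that string as a collection of characters rather than as a prefix or suffix, which is why a Nim set is a natural counterpart. A short sketch showing that order and repetition in the argument do not matter:

```python
url = "www.example.com"

# The argument is a collection of characters, not a prefix or suffix:
# order and repetition do not matter.
a = url.strip("cmowz.")
b = url.strip(".zwomc")
c = url.strip("".join(sorted(set("wwwcomzz."))))

print(a, b, c)
```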

Left/leading Strip #

print('«' + '   spacious   '.lstrip() + '»')
print('www.example.com'.lstrip('cmowz.'))
print('mississippi'.lstrip('mipz'))
«spacious   »
example.com
ssissippi
import strutils
echo "«", "   spacious   ".strip(trailing=false), "»"
echo "www.example.com".strip(trailing=false, chars={'c', 'm', 'o', 'w', 'z', '.'})
echo "mississippi".strip(trailing=false, chars={'m', 'i', 'p', 'z'})
«spacious   »
example.com
ssissippi

Right/trailing Strip #

print('«' + '   spacious   '.rstrip() + '»')
print('www.example.com'.rstrip('cmowz.'))
print('mississippi'.rstrip('mipz'))
«   spacious»
www.example
mississ
import strutils
echo "«", "   spacious   ".strip(leading=false), "»"
echo "www.example.com".strip(leading=false, chars={'c', 'm', 'o', 'w', 'z', '.'})
echo "mississippi".strip(leading=false, chars={'m', 'i', 'p', 'z'})
«   spacious»
www.example
mississ

Partition #

First occurrence partition #

print('ab:ce:ef:ce:ab'.partition(':'))
print('ab:ce:ef:ce:ab'.partition('ce'))
('ab', ':', 'ce:ef:ce:ab')
('ab:', 'ce', ':ef:ce:ab')
import strmisc
echo "ab:ce:ef:ce:ab".partition(":") # The argument is a string, not a character
echo "ab:ce:ef:ce:ab".partition("ce")
(Field0: "ab", Field1: ":", Field2: "ce:ef:ce:ab")
(Field0: "ab:", Field1: "ce", Field2: ":ef:ce:ab")

Right partition or Last occurrence partition #

print('ab:ce:ef:ce:ab'.rpartition(':'))
print('ab:ce:ef:ce:ab'.rpartition('ce'))
('ab:ce:ef:ce', ':', 'ab')
('ab:ce:ef:', 'ce', ':ab')
import strmisc
echo "ab:ce:ef:ce:ab".rpartition(":") # The argument is a string, not a character
# or
echo "ab:ce:ef:ce:ab".partition(":", right=true)
echo "ab:ce:ef:ce:ab".rpartition("ce")
# or
echo "ab:ce:ef:ce:ab".partition("ce", right=true)
(Field0: "ab:ce:ef:ce", Field1: ":", Field2: "ab")
(Field0: "ab:ce:ef:ce", Field1: ":", Field2: "ab")
(Field0: "ab:ce:ef:", Field1: "ce", Field2: ":ab")
(Field0: "ab:ce:ef:", Field1: "ce", Field2: ":ab")

Replace #

print('abc abc abc'.replace(' ab', '-xy'))
print('abc abc abc'.replace(' ab', '-xy', 0))
print('abc abc abc'.replace(' ab', '-xy', 1))
print('abc abc abc'.replace(' ab', '-xy', 2))
abc-xyc-xyc
abc abc abc
abc-xyc abc
abc-xyc-xyc
import strutils
echo "abc abc abc".replace(" ab", "-xy")
# echo "abc abc abc".replace(" ab", "-xy", 0) # Invalid, does not expect a count:int argument
# echo "abc abc abc".replace(" ab", "-xy", 1) # Invalid, does not expect a count:int argument
# echo "abc abc abc".replace(" ab", "-xy", 2) # Invalid, does not expect a count:int argument
abc-xyc-xyc

Notes #

  • Nim does not allow specifying the number of occurrences to be replaced using a count argument as in the Python version of replace.
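
If a count-limited replace is needed, the logic is a short loop around find. The sketch below writes it in Python as a hypothetical replace_count function so that it can be checked against the built-in replace; the same loop should be portable to a Nim proc, though I have not written that port here.

```python
def replace_count(s, sub, by, count=-1):
    """Replace at most `count` occurrences of `sub`, left to right.

    Hypothetical helper: a negative count means "replace all", matching
    the behavior of Python's built-in str.replace.
    """
    if not sub:
        raise ValueError("empty search string is not handled in this sketch")
    parts = []
    pos = 0
    while count != 0:
        hit = s.find(sub, pos)
        if hit < 0:
            break
        parts.append(s[pos:hit])  # text before the match
        parts.append(by)          # the replacement
        pos = hit + len(sub)      # continue after the match
        count -= 1
    parts.append(s[pos:])         # untouched remainder
    return "".join(parts)


for n in (-1, 0, 1, 2):
    print(n, replace_count("abc abc abc", " ab", "-xy", n))
```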

Split #

Split (from left) #

print('1,2,3'.split(','))
print('1,2,3'.split(',', maxsplit=1))
print('1,2,,3,'.split(','))
print('1::2::3'.split('::'))
print('1::2::3'.split('::', maxsplit=1))
print('1::2::::3::'.split('::'))
['1', '2', '3']
['1', '2,3']
['1', '2', '', '3', '']
['1', '2', '3']
['1', '2::3']
['1', '2', '', '3', '']
import strutils
echo "1,2,3".split(',')
echo "1,2,3".split(',', maxsplit=1)
echo "1,2,,3,".split(',')
echo "1::2::3".split("::")
echo "1::2::3".split("::", maxsplit=1)
echo "1::2::::3::".split("::")
@["1", "2", "3"]
@["1", "2,3"]
@["1", "2", "", "3", ""]
@["1", "2", "3"]
@["1", "2::3"]
@["1", "2", "", "3", ""]

Split from right #

rsplit behaves just like split unless the maxsplit argument is given.

print('1,2,3'.rsplit(','))
print('1,2,3'.rsplit(',', maxsplit=1))
print('1,2,,3,'.rsplit(','))
print('1::2::3'.rsplit('::'))
print('1::2::3'.rsplit('::', maxsplit=1))
print('1::2::::3::'.rsplit('::'))
['1', '2', '3']
['1,2', '3']
['1', '2', '', '3', '']
['1', '2', '3']
['1::2', '3']
['1', '2', '', '3', '']
import strutils
echo "1,2,3".rsplit(',')
echo "1,2,3".rsplit(',', maxsplit=1)
echo "1,2,,3,".rsplit(',')
echo "1::2::3".rsplit("::")
echo "1::2::3".rsplit("::", maxsplit=1)
echo "1::2::::3::".rsplit("::")
@["1", "2", "3"]
@["1,2", "3"]
@["1", "2", "", "3", ""]
@["1", "2", "3"]
@["1::2", "3"]
@["1", "2", "", "", "3", ""]

Split Lines #

splits = 'ab c\n\nde fg\rkl\r\n'.splitlines()
print(splits)
print('ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True))
['ab c', '', 'de fg', 'kl']
['ab c\n', '\n', 'de fg\r', 'kl\r\n']
import strutils
var splits: seq[string] = "ab c\n\nde fg\rkl\r\n".splitLines()
echo splits
@["ab c", "", "de fg", "kl", ""]

Notes #

  • The Nim version of splitLines does not have a second argument like keepends in the Python version splitlines.
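  • Also, the Nim version returns a trailing empty string when the input ends with a line terminator, whereas Python drops it. Note the last "" split created by Nim, but not by Python, in the above example. The \r\n pair is treated as a single terminator in both outputs.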
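
To reproduce the shape of the Nim result from Python, one option is to split on every line terminator with a regular expression. This is only a sketch that matches the output shown above; it is not a description of how Nim's splitLines is implemented.

```python
import re

text = "ab c\n\nde fg\rkl\r\n"

python_style = text.splitlines()
# Splitting on every terminator keeps the empty piece after the last one,
# which is the shape of the Nim output above. \r\n is tried first so that
# it counts as a single terminator.
nim_style = re.split(r"\r\n|\r|\n", text)

print(python_style)
print(nim_style)
```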

Convert #

See the encodings module for equivalents of Python decode and encode functions.

Others #

There is no equivalent for the Python translate function in Nim as of writing this (<2017-08-09 Wed>).
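
For reference, this is what translate does on the Python side: it maps or deletes characters one by one using a table built by str.maketrans. The example string and mapping are made up for illustration.

```python
# Map 'e' -> 'i' and 'l' -> 'p', and delete every '!'
table = str.maketrans("el", "ip", "!")
result = "hello!".translate(table)

print(result)
```

A Nim counterpart would need a similar per-character lookup, which I have not attempted here.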
