
Python vs. Scala: a comparison of the basic commands (Part I) by Emma Grimaldi


Are you proficient in Python but considering stepping into Scala? 

Data Analyst Emma Grimaldi has given us her reflections on learning how to handle strings, lists, dictionaries and more.

What has your experience of moving to another language been?


I recently started playing a little bit with Scala, and I have to say it has been kind of traumatic. I love learning new things, but after months of programming with Python, it is just not natural to set that aside and switch modes while solving Data Science problems. When learning a new language, whether it is a coding or a spoken one, it is normal for this to happen. We tend to fill in the gaps of the things we don’t know with the things we know, even if they don’t belong to the language we are trying to write/speak!

When trying to learn a new language, it is important to be completely surrounded by the language you want to learn, but first of all, it is important to have well-established parallelisms between the known and the new language, at least in the beginning. This worked for me, a bilingual person who learned a second language really quickly as an adult. At the beginning, I needed connections between Italian (the language I knew) and English (the language I was learning), but as I became more and more fluent in English, I started to forget the parallelisms because it was just becoming natural and I didn’t need to translate in my head first anymore.

The reason why I decided to write this post is, in fact, to establish parallelisms between Python and Scala for people who are fluent in one of the two and are starting to learn the other one, like myself.

I initially wanted to focus on Pandas/Sklearn and Spark, but I realized that it doesn’t make much sense without covering the foundations first. This is why in this post we’ll look at the basics of Python and Scala: how to handle strings, lists, dictionaries and so on. I intend in the near future to publish a second part, where I will cover how to handle dataframes and build predictive models in both languages.


1. First things first

The first difference is the naming convention used when coding in these two languages: nothing will throw an error if you don’t follow it, it’s just an unwritten rule that coders follow.

When defining a new variable, function or whatever, we always pick a name that makes sense to us, and that will most likely be composed of two or more words. If this is the case, in Python we will use 'snake_case', while in Scala 'camelCase': the difference is immediately noticeable. In snake case, all words are lower-case and we use '_' to separate them; in camel case there is no separation, and all words are capitalized except for the first one. For example, 'my_new_variable' in Python would be 'myNewVariable' in Scala.

Another striking difference is how we define the variables in the two languages. In Python we just make up a name and assign it to the value we need it to be, while in Scala, we need to specify whether we are defining a variable or a value, and we do this by placing 'var' or 'val' respectively, before the name (notice that this is valid whether we are assigning numerical values or strings).

The difference between 'var' and 'val' is simple: variables can be modified, while values cannot. In the example represented in the image, I instantiated a 'var' string and then changed it: all good. Then, I assigned the same string to a val and tried to change it again: not doable.

In Python there is no need to specify: if you want to change something you previously assigned, it’s up to you. In Python’s case I would just do 'string = 'my_string''.
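On the Python side, this can be sketched roughly as follows. The names and values are assumptions chosen for illustration, not the author's original example.

```python
# Illustrative names and values (assumptions, not from the original post).
my_string = "my_string"        # snake_case name, no 'var' or 'val' keyword
my_string = "another_string"   # reassigning is always allowed in Python
my_number = 42                 # the same syntax works for numbers
```

In Scala, the second line would only be accepted if the name had been declared with 'var' rather than 'val'.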

Initializing values and variables in Scala.

Another general difference regards commenting. In Python there is only one way to do it, whether it’s a single or multi-line, and that is putting a '#' before the comment, on each line:

'# this is a commented line in Python'

Scala offers a couple of ways to comment: either putting '//' on each line, or wrapping the comment between '/*' and '*/'.

Now that the very basics are explained, let’s dive deeper.


2. Lists and arrays

Lists (in Python) and Arrays (in Scala) are among the most important objects: they can contain strings and/or numbers, and we can manipulate them, iterate over them, add or remove elements and so on. They can basically serve any purpose, and I don’t think I have ever coded anything without using them, so let’s see what we can do with them, and how.

2.1. Define

Let’s create a list containing a mix of numbers and strings.
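A minimal Python sketch of this step; the name `my_list` and its contents are made up for illustration.

```python
# A list mixing integers, a float and strings (illustrative values).
my_list = [1, 2, "three", 4.0, "five"]
print(my_list)
```

In Scala, a mixed collection built with `Array(1, 2, "three")` would be typed as `Array[Any]`, whereas Python does not track the element type at all.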

2.2. Indexing

Both lists and arrays are zero indexed, which means that the first element is placed at the index 0. So, if we want to extract the second element:
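In Python this could look like the sketch below, where `my_list` is an assumed example list. Python uses square brackets for indexing, while Scala uses round ones, as in `myArray(1)`.

```python
my_list = [1, 2, "three", 4.0, "five"]  # illustrative values

# Index 1 is the second element because counting starts at 0.
second_element = my_list[1]
print(second_element)  # prints 2
```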

2.3. Slicing

In both languages, the second index will not be counted when slicing. So, if we want to extract the first 3 elements:
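A short Python sketch, again with an assumed example list. The rough Scala counterpart would be the `slice` method, as in `myArray.slice(0, 3)`.

```python
my_list = [1, 2, "three", 4.0, "five"]  # illustrative values

# Indices 0, 1 and 2 are returned; index 3 is excluded.
first_three = my_list[0:3]
print(first_three)  # prints [1, 2, 'three']
```

Writing `my_list[:3]` gives the same result, since a missing start index defaults to 0.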

2.4. Checking first, last, maximum and minimum element
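A minimal Python sketch for this step, assuming a purely numeric list named `numbers` (a made-up example). Scala arrays expose the same four operations as `head`, `last`, `max` and `min`.

```python
numbers = [4, 8, 15, 16, 23, 42]  # illustrative values

first = numbers[0]       # first element
last = numbers[-1]       # negative indices count from the end
largest = max(numbers)   # built-in function, not a method
smallest = min(numbers)

print(first, last, largest, smallest)  # prints 4 42 42 4
```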


2.5. Sum and product

These operations, like min and max, are supported only if the lists/arrays contain exclusively numbers. Also, Python has no built-in function to multiply all the elements of a list, as opposed to Scala: we can either set up a 'for' loop, which will be covered further down in the post, or, from Python 3.8 onwards, use 'math.prod' from the standard library.
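A possible Python sketch, assuming a small numeric list named `numbers`. It shows the built-in `sum`, the 'for' loop approach to the product, and `math.prod` from the standard library (available since Python 3.8) as a shortcut.

```python
import math

numbers = [1, 2, 3, 4]  # illustrative values

total = sum(numbers)  # built-in sum

# Product with an explicit loop.
product = 1
for number in numbers:
    product *= number

# Same result with the standard library (Python 3.8+).
product_with_math = math.prod(numbers)

print(total, product, product_with_math)  # prints 10 24 24
```

In Scala, both operations are available directly on the array, as `sum` and `product`.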

2.6. Adding elements

Lists and arrays keep their elements in insertion order rather than sorting them, so it’s common practice to add elements at the end. Let’s say we want to add the string '"last words"':
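In Python, this could be sketched with `append`, which modifies the list in place. The starting list is an assumption for illustration.

```python
my_list = [1, 2, "three"]  # illustrative values

my_list.append("last words")  # adds the element at the end, in place
print(my_list)  # prints [1, 2, 'three', 'last words']
```

A rough Scala counterpart is the `:+` operator, which returns a new array instead of changing the existing one.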

If, for some reason, we want to add something at the very beginning, let’s say the number '99':
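A Python sketch using `insert` with index 0; the starting list is again an assumed example.

```python
my_list = [1, 2, "three"]  # illustrative values

my_list.insert(0, 99)  # position 0 puts the element at the very beginning
print(my_list)  # prints [99, 1, 2, 'three']
```

In Scala, prepending is typically written with the `+:` operator, as in `99 +: myArray`.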

3. Print

This is also something that we use all the time while coding. Luckily, there is only a slight difference between the two languages.
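A small Python sketch; the variable names and the message are made up for illustration. Python uses `print`, while Scala uses `println`.

```python
name = "Scala"                # illustrative value
message = f"Hello, {name}!"   # f-strings embed values in a string

print(message)  # prints Hello, Scala!
```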

4. For loop

Quite a few differences here: while Python requires a colon after the statement and indentation to create a block, Scala wants the for conditions in parentheses and the block in curly brackets, with no indentation needed. I like to use indentation anyway though, as it makes the code look neater.
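The Python side might look like this sketch, where the list and the doubling step are assumptions for illustration.

```python
numbers = [1, 2, 3]  # illustrative values
doubled = []

# The colon opens the block; the indented lines form its body.
for number in numbers:
    doubled.append(number * 2)
    print(number)

print(doubled)  # prints [2, 4, 6]
```

A rough Scala equivalent of the loop header would be `for (number <- numbers) { ... }`.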

5. Mapping and/or filtering

In Python, both of these can be done with list comprehensions. In Scala, we will have to use functions such as 'map' and 'filter'.

5.1. Mapping

Let’s say we have a list/array with only numeric values and we want to triple all of them.
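A list comprehension sketch in Python, assuming a numeric list named `numbers`.

```python
numbers = [1, 2, 3, 4, 5, 6]  # illustrative values

# Apply the same operation to every element.
tripled = [number * 3 for number in numbers]
print(tripled)  # prints [3, 6, 9, 12, 15, 18]
```

In Scala, the rough counterpart would be `numbers.map(_ * 3)`.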

5.2. Filtering

Let’s say we have a list/array with only numeric values and we want to filter only those divisible by 3.
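In Python, the condition goes at the end of the comprehension, as in this sketch with an assumed list `numbers`.

```python
numbers = [1, 2, 3, 4, 5, 6]  # illustrative values

# Keep only the elements whose remainder after division by 3 is zero.
divisible_by_three = [number for number in numbers if number % 3 == 0]
print(divisible_by_three)  # prints [3, 6]
```

In Scala, this would roughly be `numbers.filter(_ % 3 == 0)`.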


5.3. Filtering and mapping

What if we want to find the even numbers and multiply only them by 3?
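Both steps fit in a single Python list comprehension, sketched here with an assumed list `numbers`.

```python
numbers = [1, 2, 3, 4, 5, 6]  # illustrative values

# The 'if' part filters the even numbers, the first expression triples them.
tripled_evens = [number * 3 for number in numbers if number % 2 == 0]
print(tripled_evens)  # prints [6, 12, 18]
```

In Scala, the same result could be obtained by chaining the two functions, as in `numbers.filter(_ % 2 == 0).map(_ * 3)`.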


6. Dictionaries/Maps

Although they have different names in the two languages, they are exactly the same thing. They both have 'keys' to which we assign 'values'.

6.1. Create dictionary/map

Let’s create one storing my first, last name and age… and let’s also pretend I am 18.
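A minimal Python sketch; the dictionary name and the key names are assumptions, while the values follow the text above.

```python
# Keys are strings, values can be of any type.
my_dict = {
    "first_name": "Emma",
    "last_name": "Grimaldi",
    "age": 18,
}
print(my_dict)
```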

In Scala we can do this in two different ways.

6.2. Adding to dictionary/map

Let’s add my Country of origin to my dictionary/map.
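In Python, assigning to a new key is enough, as in the sketch below. The key name `country` is hypothetical, and the value `"Italy"` is an assumption inferred from the author's mention of Italian as her first language.

```python
my_dict = {"first_name": "Emma", "last_name": "Grimaldi", "age": 18}

# Assigning to a key that does not exist yet creates the entry.
my_dict["country"] = "Italy"  # assumed value, for illustration only
print(my_dict)
```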

6.3. Indexing

This works the same way as indexing lists/array, but instead of positions, we are using keys. If I want to see my first name:
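A Python sketch with assumed key names. As with lists and arrays, Python uses square brackets here, while Scala uses round ones, as in `myMap("first_name")`.

```python
my_dict = {"first_name": "Emma", "last_name": "Grimaldi", "age": 18}

first_name = my_dict["first_name"]  # look up a value by its key
print(first_name)  # prints Emma
```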


6.4. Looping

If we want to print the entries of the dictionary/map one by one, we will have to use a for loop in both cases, over keys and values.
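In Python, the `items()` method yields key/value pairs that can be unpacked in the loop header. The dictionary below is an assumed example.

```python
my_dict = {"first_name": "Emma", "last_name": "Grimaldi", "age": 18}

# Each iteration unpacks one (key, value) pair.
for key, value in my_dict.items():
    print(key, value)
```

A rough Scala counterpart would be `for ((key, value) <- myMap) println(s"$key $value")`.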

7. Tuples

Yes, they are called the same in both languages! But, while they are zero-indexed in Python, they are not in Scala, where the first element is number 1. Let’s create a tuple '(1, 2, 3)' and then call the first value.
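The Python side could be sketched as follows; the variable names are made up for illustration.

```python
my_tuple = (1, 2, 3)

first_value = my_tuple[0]  # zero-indexed, just like a list
print(first_value)  # prints 1
```

In Scala, the first value of the same tuple would be read with `myTuple._1`.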


8. Sets

Yes, another name in common! In both examples below, the sets will contain only '1, 3, 5' because sets don’t accept duplicates.
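A Python sketch, where the starting values with repeats are an assumption chosen to show the deduplication.

```python
# Curly braces with bare values (no keys) create a set.
my_set = {1, 3, 5, 3, 1}

# The repeated 1 and 3 are dropped, leaving 1, 3 and 5.
print(len(my_set))  # prints 3
```

In Scala, the rough counterpart would be `Set(1, 3, 5, 3, 1)`.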


9. Functions

We have covered a lot so far, good job if you made it down here! This is the last section of this post, and luckily defining a function is not that different between Python and Scala. They both start with 'def', and while the former requires a 'return' statement to give back a value, the latter does not. On the other hand, Scala wants to know what types of variables we are going to input and output, while Python doesn’t care. Let’s write a very simple function that takes a string as input and returns the first 5 characters.

Indentation is also important in Python, or the function will not work. Scala instead just likes its curly braces.
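A minimal Python sketch of such a function; the name `first_five` and the sample input are hypothetical.

```python
def first_five(text):
    # The indented body belongs to the function; 'return' hands back the result.
    return text[:5]


result = first_five("Python and Scala")
print(result)  # prints Pytho
```

A rough Scala counterpart, with the input and output types declared, would be `def firstFive(text: String): String = { text.take(5) }`.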

That’s it! I hope you found this helpful as an immediate reference if you are just starting to get familiar with either Python or Scala. The next step will be to build a similar guide exploring the differences between pandas/sklearn and Spark. I am looking forward to it, and I hope you are as well!



If you are wondering why you should use Python rather than Scala, or vice versa, I found the image below rather helpful in clarifying the immediate differences between the two.

This article was written by Emma Grimaldi and posted originally on Medium