[Scala] Fundamental Concurrency Problems and Solutions

Featured image

Photo by Logan Voss on Unsplash

The best way to approach concurrency is to master the fundamentals. Below is a small collection of foundational concurrency problems with practical solutions using Scala and Cats Effect.

Conclusion

In this post, we explore four fundamental concurrency problems and their idiomatic solutions using Scala and Cats Effect, but which could be generalized to any programming language with concurrency support.

After reading this post, you should be able to:

  • Mutex for synchronizing access to shared resources
  • Semaphore for limiting the number of concurrent tasks
  • CyclicBarrier for synchronizing tasks at a common point
  • CountDownLatch for waiting for a signal before proceeding with tasks

Problem 1: Counter Updated in Parallel (Mutex)

❌ The Problem

Suppose we have a simple counter that is incremented by 1000 tasks running in parallel. Without synchronization, we get incorrect results.

import cats.effect.{ IO, Ref }
import cats.effect.unsafe.implicits.global

import scala.util.Random

@main def mutexDemo(): Unit =
  // Problem
  import cats.syntax.parallel.*

  var counter = 0 // shared, unsafe

  val increment = IO {
    val tmp = counter
    Thread.sleep(Random.between(10, 20)) // it makes the problem more visible, as we wait 
    counter = tmp + 1
  }

  val program =
    for
      _ <- List.fill(1000)(increment).parSequence
      _ <- IO.println(counter)
    yield ()

  program.unsafeRunSync()

Output:

101 //  different results every time you run it.

This is due to counter += 1 being a non-atomic read-modify-write operation. In assembly, it roughly translates to:

mov eax, [counter]  ; read counter
add eax, 1          ; increment
mov [counter], eax  ; write back

When multiple tasks interleave at these instructions, updates are lost.

✅ The Solution: Mutex

To fix the problem, we use a mutex — a mutual exclusion lock that ensures only one task modifies the counter at a time. In Cats Effect, we can implement a mutex using a Semaphore with a single permit:

@main def mutexDemo(): Unit =
  // ...
  counter = 0 // reset

  // Solution
  import cats.effect.std.Semaphore

    val program2 =
      for
        mutex    <- Semaphore[IO](1)
        increment = mutex.permit.use(_ => IO(counter += 1))
        _        <- List.fill(1000)(increment).parSequence
        _        <- IO.println(counter)
      yield ()

    program2.unsafeRunSync()
}

Output:

1000 //  every time you run it

Problem 2: Max 10 Workers in Parallel (Semaphore)

In this example, we want to run 50 tasks but allow only 10 to run in parallel.

Using a mutex (as in the previous example) would require us to manually track the number of active tasks and implement a queue or backpressure mechanism, which adds unnecessary complexity.

Instead, we’ll use a simpler and more idiomatic solution.

❌ The Problem

import cats.effect.std.Semaphore
import cats.effect.{ IO, Ref }
import cats.syntax.parallel.*

import scala.util.Random
import scala.concurrent.duration.*
import cats.effect.unsafe.implicits.global

@main def semaphoreDemo(): Unit =
  // Problem: Run 50 tasks, but only 10 in parallel

  def runTask(id: Int, currRef: Ref[IO, Int], maxSeen: Ref[IO, Int]): IO[Unit] =
    for
      running <- currRef.updateAndGet(_ + 1)
      _       <- maxSeen.update(m => math.max(m, running))
      _       <- IO.println(s"[$id] running (current: $running")
      _       <- IO.sleep((Random.between(100, 300)).millisecond)
      _       <- currRef.update(_ - 1)
    yield ()
  end runTask

  val program: IO[Unit] =
    for
      running <- Ref[IO].of(0)
      maxSeen <- Ref[IO].of(0)
      _       <- (1 to 50).toList.parTraverse(id => runTask(id, running, maxSeen))
      max     <- maxSeen.get
      _       <- IO.println(s"Max concurrently running tasks: $max")
    yield ()

  program.unsafeRunSync()

Output:

Max concurrently running tasks: 50

Clearly not what we want — we expected a cap at 10.

✅ The Solution: Semaphore

A semaphore is a synchronization primitive that limits the number of concurrent accesses to a resource.

Here, we use a semaphore with 10 permits to allow at most 10 tasks to run in parallel.

@main def semaphoreDemo(): Unit =
  // ...

  val program2 =
    for
      runningRef <- Ref[IO].of(0)
      maxSeenRef <- Ref[IO].of(0)
      semaphore  <- Semaphore[IO](10)
      guarded     = (id: Int) => semaphore.permit.use(_ => runTask(id, runningRef, maxSeenRef))
      _          <- (1 to 50).toList.parTraverse(guarded)
      max        <- maxSeenRef.get
      _          <- IO.println(s"Max concurrently running tasks: $max")
    yield ()

  program2.unsafeRunSync()

Each task acquires a permit before running, ensuring that no more than 10 run concurrently. Once a task completes, it releases its permit, allowing another task to proceed.

Outupt: Max concurrently running tasks: 10 // each time you run it.

Problem 3: Wait for All Tasks to Be Ready (Barrier)

In this example, we want to run 5 tasks, but ensure they all wait until everyone is ready before proceeding to the next step.

❌ The Problem

import cats.effect.IO
import cats.effect.std.CyclicBarrier

import scala.util.Random
import scala.concurrent.duration.*
import cats.syntax.parallel.*
import cats.effect.unsafe.implicits.global

@main def barrierDemo(): Unit =
  def task(id: Int) =
    for
      _ <- IO.println(s"Task $id preparing")
      _ <- IO.sleep((Random.between(100, 500)).millis)
      _ <- IO.println(s"Task $id: Preparing done")
      _ <- IO.println(s"Task $id: Doing what I want to do") // no waiting
      _ <- IO.sleep((Random.between(100, 500)).millis)
    yield ()

  val program =
    (1 to 5).toList.parTraverse(task)

  program.unsafeRunSync()

Output:

Task 1 preparing
Task 2 preparing
Task 3 preparing
Task 4 preparing
Task 5 preparing
Task 2: Preparing done
Task 2: Doing what I want to do
Task 3: Preparing done
Task 3: Doing what I want to do
Task 1: Preparing done
Task 1: Doing what I want to do
Task 4: Preparing done
Task 4: Doing what I want to do
Task 5: Preparing done
Task 5: Doing what I want to do

Each task simulates a “preparing” phase, followed by the actual work. However, without coordination, tasks begin their work as soon as they finish preparing. There’s no synchronization point — they continue independently.

This means some tasks might proceed while others are still preparing, which isn’t the desired behavior when you want them to start simultaneously.

✅ The Solution: Barrier

To fix this, we use a CyclicBarrier.

A cyclic barrier is a synchronization primitive that blocks all participating tasks at a certain point (the barrier) until all have reached it. Only once all tasks are ready does the barrier open, allowing them to proceed together.

@main def barrierDemo(): Unit =
  // ...

  def task2(id: Int, barrier: CyclicBarrier[IO]) =
    for
      _ <- IO.println(s"Task $id preparing")
      _ <- IO.sleep((Random.between(100, 500)).millis)
      _ <- IO.println(s"Task $id: Preparing done")
      _ <- barrier.await
      _ <- IO.println(s"Task $id: Doing what I want to do") // no waiting
      _ <- IO.sleep((Random.between(100, 500)).millis)
    yield ()

  val program2 = for
    barrier <- CyclicBarrier[IO](5)
    _       <- (1 to 5).toList.parTraverse(id => task2(id, barrier))
  yield ()

  program2.unsafeRunSync()

Output:

Task 1 preparing
Task 2 preparing
Task 3 preparing
Task 5 preparing
Task 4 preparing
Task 3: Preparing done
Task 1: Preparing done
Task 4: Preparing done
Task 5: Preparing done
Task 2: Preparing done
Task 2: Doing what I what to do
Task 4: Doing what I what to do
Task 5: Doing what I what to do
Task 3: Doing what I what to do
Task 1: Doing what I what to do

Much more obidient tasks!

In our case, after each task finishes preparing, it waits at the barrier. Once all 5 tasks reach it, they continue in parallel.

This guarantees that no task starts the “real” work until everyone is ready — perfect for scenarios where coordination matters, such as simulations, games, or collective task triggers.

Problem 4: Wait Until Signal to Start (Latch)

The final problem in this post is about delaying the start of tasks until a signal is given.

Imagine you have a set of tasks that shouldn’t start immediately — they should wait for a “go” signal.
This is a common pattern in concurrent systems where you want to ensure everything is prepared before kicking off execution.

❌ The Problem

@main def latchesDemo(): Unit =

  def task(id: Int): IO[Unit] =
    IO.println(s"[$id] running too early")

  val program =
    for
      _ <- IO.sleep(1.second)
      _ <- (1 to 5).toList.parTraverse_(task)
    yield ()

  program.unsafeRunSync()

In the naive version, tasks are simply launched after a short delay. There’s no coordination — they run as soon as they’re scheduled. This doesn’t satisfy the requirement of waiting for an external signal.

✅ The Solution: Latch

To fix this, we use a CountDownLatch.

A latch is a synchronization primitive that allows one or more tasks to wait until a signal is given (e.g., a counter hits zero).

@main def latchesDemo(): Unit =
  // ...

  def task2(id: Int, latch: CountDownLatch[IO]) =
    for
      _ <- IO.println(s"[$id] Waiting for signal")
      _ <- latch.await
      _ <- IO.println(s"[$id] Processing after signal (latch released)")
    yield ()

  val program2 =
    for
      latch <- CountDownLatch[IO](1)
      fiber <- (1 to 5).toList.parTraverse_(id => task2(id, latch)).start // ! we start a fiber
      _     <- IO.println("All tasks should wait for signal!")
      _     <- IO.sleep(5.seconds)
      _     <- IO.println("Setup complete")
      _     <- latch.release
      _     <- fiber.join
    yield ()

  program2.unsafeRunSync()

Output:

[2] Waiting for signal
[3] Waiting for signal
[1] Waiting for signal
[4] Waiting for signal
All tasks should wait for signal!
[5] Waiting for signal
Setup complete
[1] Processing after signal (latch released)
[2] Processing after signal (latch released)
[4] Processing after signal (latch released)
[3] Processing after signal (latch released)
[5] Processing after signal (latch released)

In our case, we use a latch with an initial count of 1. All tasks call await, blocking until the latch is released. Once we call release, all tasks proceed.

We also start the tasks inside a fiber so they can run in parallel and reach the waiting point without blocking the rest of the program.

The key moment is when we release the latch — only then do the tasks resume and continue

This pattern is useful when you want a coordinated start after setup or resource allocation is complete.

There are plenty of great books on concurrency, both general and specific to Scala/Java. Here are some recommendations:

General concepts

  • The Little Book of Semaphores – Allen B. Downey
  • Operating Systems: Three Easy Pieces (Part II: Concurrency)
  • The Art of Multiprocessor Programming – Maurice Herlihy & Nir Shavit
  • Modern Operating Systems – Andrew Tanenbaum

Java/Scala Specific

  • Java Concurrency in Practice – Brian Goetz
  • Programming Concurrency on the JVM – Venkat Subramaniam
  • Functional Programming in Scala – Paul Chiusano & Rúnar Bjarnason (Chapter 12: Concurrency)
  • Advanced Scala with Cats – Noel Welsh & Dave Gurnell (Chapter 8: Concurrency)

Distributed Systems

  • Designing Data-Intensive Applications – Martin Kleppmann (Chapter 6: Distributed Systems)
  • Reactive Design Patterns – Roland Kuhn, Jamie Allen & Brian Hanafee (Chapter 2: Concurrency)