

Homepage

Hi, I’m Alva

I am a researcher and physician, currently working with Marie von Lilienfeld-Toal at the Institute for Diversity Medicine at Ruhr University Bochum, formerly at the Biophysical Imaging Lab of Christian Eggeling in Jena.

Contact

You can get in contact with me by sending an email to alva.seltmann (at) rub (dot) de. You can also find me on Github.

Homepage_de

Hi, I'm Alva

I am a researcher and physician, currently working with Marie von Lilienfeld-Toal at the Institute for Diversity Medicine at Ruhr University Bochum, and before that in the Biophysical Imaging group of Christian Eggeling in Jena.

Contact

You can reach me by email at alva.seltmann (at) rub (dot) de. You can also find me on Github.

Publications

Publications

Publications

Publications

Notes

Notes

Poetry

Lǎozǐ

I am partial to the translation by Derek Lin.

Aphorisms

Test page

Year numbering system

Examples:

| Human Era year | Common Era year | Event                  |
|----------------|-----------------|------------------------|
| 1001 HE        | 9000 BCE        | Jericho                |
| 7301 HE        | 2700 BCE        | First pyramid          |
| 11460 HE       | 1460 CE         | Machu Picchu built     |
| 11945 HE       | 1945 CE         | United Nations founded |
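The rule behind the table is simple arithmetic: add 10,000 to CE years, and map a BCE year n to 10001 − n, since there is no year zero. A quick sketch (function name is mine):

```python
def to_human_era(year, era="CE"):
    """Convert a CE/BCE year to the Human Era (Holocene) calendar."""
    if era == "CE":
        return year + 10000
    # BCE: there is no year 0, so 1 BCE corresponds to 10000 HE
    return 10001 - year

print(to_human_era(9000, "BCE"))  # 1001
print(to_human_era(1945))         # 11945
```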

Wonderful illustration of the case for the Human Era (courtesy of [kurzgesagt.org](https://kurzgesagt.org)):

Structured arguments and thoughts, often presented in lists.

Bold, italic, verbatim, strikethrough

This is a test. And Hugo Book Shortcodes in an org source file:

Buttons

Columns

Expand

Here are the details.

Hints

Display Mode (KaTeX)

This should also be possible in plain org-mode: π(x) vs $π(x)$ vs \(π(x)\)

This works, but requires MathJax:

\begin{equation} \label{eq:1} C = W \log_2 (1+\mathrm{SNR}) \end{equation}

x = \begin{cases} a &\text{if } b \\ c &\text{if } d \end{cases}

Generate SVG charts and diagrams for text (Mermaid)

Tabs

Science & Medicine

Critical test section

test section

Test section text.

Other critical stuff

Here is other critical stuff.

International

Africa

Notes on specific countries

Phases after WWII

1st phase: 50s/60s - rapid decolonization, optimism, euphoria

  • promise and hope, with freedom comes prosperity
  • movement of French colonies → want to become French citizens
  • movement of British colonies → Pan-Africanism
    • people: activist Alexander Crummell, writer W. E. B. Du Bois (core idea: Black* people should remain in the countries where they now live), journalist Marcus Garvey (core idea: Black* people should return to Africa)
    • core ideas
      1. all Blacks* in the world constitute a single race, a single culture, and should be proud of their colour of skin
      2. all of Africa should be independent and united → “Africa for Africans”
  • BUT: corrupt elites - why?
  • no well-developed private sector, plantations belonged to foreigners, banks belonged to foreign capital → a political career was the only path to wealth

2nd phase: 70s/80s - rapid growth of population, pessimism

  • civil wars, revolts, massacres, hunger
    • opponents used all means (tribal and ethnic conflicts, military might, corruption, murder)
    • cold war era → problems + interests of weaker, dependent countries were ignored, subordinate to superpower interests
  • coup d’état in Nigeria - military seizes power after only 5 years of independence
    • and people celebrate! Since their own elite is seen as too corrupt (“black imperialists”, “political wolves, who plunder the country”)

Asia

Another continent to describe.

Section Philosophy

Philosophy

An overview from “Sophie’s World”

Notes

Notes

Poems

Erich Fried

Lǎozǐ

Personally, I like the translation by Günther Debon.

Aphorisms

Section Philosophy

Philosophy

An overview from “Sophie’s World”

Blog Posts

Test post

Test post text.

Test post (German)

Test post text.

Advent of code 2023 - Day 1: Trebuchet?!

This year I am trying to record my attempts at solving the Advent of Code 2023 riddles. This is Day 1 - see https://adventofcode.com/2023/day/1

Part 1

The newly-improved calibration document consists of lines of text; each line originally contained a specific calibration value that the Elves now need to recover. On each line, the calibration value can be found by combining the first digit and the last digit (in that order) to form a single two-digit number.

For example:

1abc2
pqr3stu8vwx
a1b2c3d4e5f
treb7uchet

In this example, the calibration values of these four lines are 12, 38, 15, and 77. Adding these together produces 142.

Consider your entire calibration document. What is the sum of all of the calibration values?
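The four example lines can be checked with a plain-Python sketch before touching the real input (no pandas needed; variable names are mine):

```python
lines = ["1abc2", "pqr3stu8vwx", "a1b2c3d4e5f", "treb7uchet"]

values = []
for line in lines:
    digits = [c for c in line if c.isdigit()]
    # the first and the last digit form a two-digit number
    values.append(int(digits[0] + digits[-1]))

print(values, sum(values))  # [12, 38, 15, 77] 142
```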

My input file: 2023-12-01-1-aoc.txt

Let's start Jupyter in our shell and get coding!

conda activate tf
jupyter lab --no-browser --port=8888

First, load the input document

import pandas as pd
import re

txt = pd.read_table('data/2023-12-01-1-aoc.txt', names=['code'])
txt
                              code
0                   jjfvnnlfivejj1
1                        6fourfour
2                    ninevbmltwo69
3         pcg91vqrfpxxzzzoneightzt
4    jpprthxgjfive3one1qckhrptpqdc
..                             ...
995       583sevenhjxlqzjgbzxhkcl5
996                            81s
997        2four3threesxxvlfqfive4
998        nine6eightsevenzx9twoxc
999    hmbfjdfnp989mfivefiverpzrjs

1000 rows × 1 columns

Second, extract the digits. I had to wrap my head around regex matching in Python first: I initially tried pandas' Series.str.extract (which only extracts the first match), then Series.str.extractall (which extracts all matches but puts them into a MultiIndex, making things more difficult in this case). So I settled for the re.findall version, which returns a list. To concatenate the elements of the list, we use the join function.
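The difference between the three approaches can be seen on a toy string (the string and variable names are mine):

```python
import re
import pandas as pd

s = pd.Series(["a1b22c3"])

# Series.str.extract returns only the first match
first = s.str.extract(r"(\d+)")
# Series.str.extractall returns every match, indexed by a MultiIndex
every = s.str.extractall(r"(\d+)")
# re.findall gives a flat list, which is easy to join
joined = "".join(re.findall(r"\d+", "a1b22c3"))

print(first.iloc[0, 0], list(every[0]), joined)  # 1 ['1', '22', '3'] 1223
```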

txt['digits'] = txt.loc[:, 'code'].apply(
    lambda x: ''.join(re.findall(r'(\d+)', x)))
txt
                              code digits
0                   jjfvnnlfivejj1      1
1                        6fourfour      6
2                    ninevbmltwo69     69
3         pcg91vqrfpxxzzzoneightzt     91
4    jpprthxgjfive3one1qckhrptpqdc     31
..                             ...    ...
995       583sevenhjxlqzjgbzxhkcl5   5835
996                            81s     81
997        2four3threesxxvlfqfive4    234
998        nine6eightsevenzx9twoxc     69
999    hmbfjdfnp989mfivefiverpzrjs    989

1000 rows × 2 columns

Next, combine the first and the last digit and convert the result from string to integer

txt['calibration'] = txt.loc[:, 'digits'].apply(
    lambda x: int(x[0] + x[-1]))
txt
                              code digits  calibration
0                   jjfvnnlfivejj1      1           11
1                        6fourfour      6           66
2                    ninevbmltwo69     69           69
3         pcg91vqrfpxxzzzoneightzt     91           91
4    jpprthxgjfive3one1qckhrptpqdc     31           31
..                             ...    ...          ...
995       583sevenhjxlqzjgbzxhkcl5   5835           55
996                            81s     81           81
997        2four3threesxxvlfqfive4    234           24
998        nine6eightsevenzx9twoxc     69           69
999    hmbfjdfnp989mfivefiverpzrjs    989           99

1000 rows × 3 columns

Lastly, get the sum of our calibration numbers

txt.loc[:, 'calibration'].sum()

Part 2

Your calculation isn’t quite right. It looks like some of the digits are actually spelled out with letters: one, two, three, four, five, six, seven, eight, and nine also count as valid “digits”.

Equipped with this new information, you now need to find the real first and last digit on each line. For example:

two1nine
eightwothree
abcone2threexyz
xtwone3four
4nineeightseven2
zoneight234
7pqrstsixteen

In this example, the calibration values are 29, 83, 13, 24, 42, 14, and 76. Adding these together produces 281.

What is the sum of all of the calibration values?

Okay, let's see if we can update the pattern matching. To deal with potentially overlapping values like oneight, which contains one as well as eight, I used the regex positive lookahead ?= as described here. Because this enables capturing overlapping values, I used \d (one digit) instead of \d+ (one or more digits), since otherwise digits might be counted twice.
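The effect of the lookahead is easy to demonstrate on the oneight case: a plain alternation consumes the characters it matches, while a zero-width lookahead with a capture group also finds the overlap.

```python
import re

plain = re.findall(r"one|eight", "oneight")
overlap = re.findall(r"(?=(one|eight))", "oneight")
print(plain, overlap)  # ['one'] ['one', 'eight']
```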

# for i, r in enumerate(txt.loc[:, 'code']):
#     matches = re.findall(
#         r'(?=(\d|one|two|three|four|five|six|seven|eight|nine))', r)
#     result = ''.join([match for match in matches])
#     result = result.replace('one', '1').replace('two', '2').replace(
#         'three', '3').replace('four', '4').replace('five', '5').replace(
#         'six', '6').replace('seven', '7').replace('eight', '8').replace(
#         'nine', '9')
#     txt.loc[i, 'digits2'] = result
# txt

# a very nice alternative suggested by Tomalak:
digits = r'\d one two three four five six seven eight nine'.split()  # raw string avoids an invalid-escape warning


txt['digits2'] = txt.loc[:, 'code'].apply(lambda v: ''.join(
    str(digits.index(m)) if m in digits else m
    for m in re.findall(rf'(?=({"|".join(digits)}))', v)
))
txt
                              code digits  calibration  digits2
0                   jjfvnnlfivejj1      1           11       51
1                        6fourfour      6           66      644
2                    ninevbmltwo69     69           69     9269
3         pcg91vqrfpxxzzzoneightzt     91           91     9118
4    jpprthxgjfive3one1qckhrptpqdc     31           31     5311
..                             ...    ...          ...      ...
995       583sevenhjxlqzjgbzxhkcl5   5835           55    58375
996                            81s     81           81       81
997        2four3threesxxvlfqfive4    234           24   243354
998        nine6eightsevenzx9twoxc     69           69   968792
999    hmbfjdfnp989mfivefiverpzrjs    989           99    98955

1000 rows × 4 columns

Now, construct the calibration value as before…

txt['calibration2'] = txt.loc[:, 'digits2'].apply(lambda x: int(x[0] + x[-1]))
txt
                              code digits  calibration  digits2  calibration2
0                   jjfvnnlfivejj1      1           11       51            51
1                        6fourfour      6           66      644            64
2                    ninevbmltwo69     69           69     9269            99
3         pcg91vqrfpxxzzzoneightzt     91           91     9118            98
4    jpprthxgjfive3one1qckhrptpqdc     31           31     5311            51
..                             ...    ...          ...      ...           ...
995       583sevenhjxlqzjgbzxhkcl5   5835           55    58375            55
996                            81s     81           81       81            81
997        2four3threesxxvlfqfive4    234           24   243354            24
998        nine6eightsevenzx9twoxc     69           69   968792            92
999    hmbfjdfnp989mfivefiverpzrjs    989           99    98955            95

1000 rows × 5 columns

… and get the correct sum!

txt.loc[:, 'calibration2'].sum()

Advent of code 2023 - Day 1: Trebuchet?!

This year I am trying to record my attempts at solving the Advent of Code 2023 riddles. This is Day 1 - see https://adventofcode.com/2023/day/1

Advent of code 2023 - Day 2: Cube Conundrum

This year I am trying to record my attempts at solving the Advent of Code 2023 riddles. This is Day 2 - see https://adventofcode.com/2023/day/2

Part 1

As you walk, the Elf shows you a small bag and some cubes which are either red, green, or blue. Each time you play this game, he will hide a secret number of cubes of each color in the bag, and your goal is to figure out information about the number of cubes.

To get information, once a bag has been loaded with cubes, the Elf will reach into the bag, grab a handful of random cubes, show them to you, and then put them back in the bag. He’ll do this a few times per game.

You play several games and record the information from each game (your puzzle input). Each game is listed with its ID number (like the 11 in Game 11: ...) followed by a semicolon-separated list of subsets of cubes that were revealed from the bag (like 3 red, 5 green, 4 blue).

For example, the record of a few games might look like this:

Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green
Game 2: 1 blue, 2 green; 3 green, 4 blue, 1 red; 1 green, 1 blue
Game 3: 8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red
Game 4: 1 green, 3 red, 6 blue; 3 green, 6 red; 3 green, 15 blue, 14 red
Game 5: 6 red, 1 blue, 3 green; 2 blue, 1 red, 2 green

In game 1, three sets of cubes are revealed from the bag (and then put back again). The first set is 3 blue cubes and 4 red cubes; the second set is 1 red cube, 2 green cubes, and 6 blue cubes; the third set is only 2 green cubes.

The Elf would first like to know which games would have been possible if the bag contained only 12 red cubes, 13 green cubes, and 14 blue cubes?

In the example above, games 1, 2, and 5 would have been possible if the bag had been loaded with that configuration. However, game 3 would have been impossible because at one point the Elf showed you 20 red cubes at once; similarly, game 4 would also have been impossible because the Elf showed you 15 blue cubes at once. If you add up the IDs of the games that would have been possible, you get 8.

Determine which games would have been possible if the bag had been loaded with only 12 red cubes, 13 green cubes, and 14 blue cubes. What is the sum of the IDs of those games?
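The feasibility check can be sketched on two of the example games with a small helper (function and variable names are mine, not the approach used below):

```python
import re

LIMITS = {"red": 12, "green": 13, "blue": 14}

def possible(record):
    """True if every revealed count fits in the 12/13/14 cube bag."""
    return all(int(n) <= LIMITS[color]
               for n, color in re.findall(r"(\d+) (red|green|blue)", record))

game1 = "3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green"
game3 = "8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red"
print(possible(game1), possible(game3))  # True False (20 red exceeds 12)
```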

My input file: 2023-12-02-1-aoc.txt

Okay, let’s load our python kernel in emacs-jupyter and get coding! First of all, let’s load the input and split the riddle code by colon : to extract the game id and the rest of the code by semicolon ; to get the number of sets played in each game.

import pandas as pd
import re

txt = pd.read_table('data/2023-12-02-1-aoc.txt', names=['code'])
txt['id'] = txt.loc[:, 'code'].str.split(':').apply(
    lambda x: int(x[0].strip('Game ')))
txt['code'] = txt.loc[:, 'code'].str.split(':').apply(lambda x: x[1])
# txt['code'] = txt.loc[:, 'code'].str.split(';')
# txt['nsets'] = txt.loc[:, 'code'].apply(lambda x: len(x))
txt
                                                  code   id
0    1 green, 1 blue, 1 red; 1 green, 8 red, 7 blu…    1
1    9 red, 7 green, 3 blue; 15 green, 2 blue, 5 r…    2
2    3 red, 1 blue, 4 green; 6 red, 3 green, 2 blu…    3
3    2 blue, 2 green, 19 red; 3 blue, 11 red, 16 g…    4
4    8 green, 1 red, 12 blue; 10 green, 6 red, 13 …    5
..                                                 ...  ...
95   2 red, 2 green, 1 blue; 1 red, 4 green; 1 green   96
96   4 red, 5 green; 5 blue, 3 red; 8 blue, 2 gree…   97
97   1 blue; 2 green, 1 red; 5 red, 2 green; 4 red…   98
98   6 blue, 5 red, 2 green; 9 red, 1 blue; 2 gree…   99
99   1 blue, 13 green, 14 red; 11 green, 11 blue, …  100

100 rows × 2 columns

Now, let's extract the three colors into different columns with regex. We use the lookahead assertion ?= to find the respective colours and only extract the digits \d+ coming before them. Then we just keep the maximum drawn number of cubes per color, since this is the only information that matters at the moment.
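The pattern used below can be demonstrated on a toy record (the string is mine): the `.` consumes the space before the colour name, and the lookahead keeps the colour itself out of the match.

```python
import re

code = "1 green, 1 blue, 8 red; 2 green"
greens = [int(n) for n in re.findall(r"\d+(?=.green)", code)]
print(greens, max(greens))  # [1, 2] 2
```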

txt['green'] = txt.loc[:, 'code'].apply(
    lambda code: re.findall(r'\d+(?=.green)', code)).apply(
        lambda list: max([int(i) for i in list]))
txt['red'] = txt.loc[:, 'code'].apply(
    lambda code: re.findall(r'\d+(?=.red)', code)).apply(
        lambda list: max([int(i) for i in list]))
txt['blue'] = txt.loc[:, 'code'].apply(
    lambda code: re.findall(r'\d+(?=.blue)', code)).apply(
        lambda list: max([int(i) for i in list]))
txt
                                                  code   id  green  red  blue
0    1 green, 1 blue, 1 red; 1 green, 8 red, 7 blu…    1      2   10    10
1    9 red, 7 green, 3 blue; 15 green, 2 blue, 5 r…    2     15   10     3
2    3 red, 1 blue, 4 green; 6 red, 3 green, 2 blu…    3      4    6    16
3    2 blue, 2 green, 19 red; 3 blue, 11 red, 16 g…    4     16   20    18
4    8 green, 1 red, 12 blue; 10 green, 6 red, 13 …    5     10    6    14
..                                                 ...  ...    ...  ...   ...
95   2 red, 2 green, 1 blue; 1 red, 4 green; 1 green   96      4    2     1
96   4 red, 5 green; 5 blue, 3 red; 8 blue, 2 gree…   97      5    4     8
97   1 blue; 2 green, 1 red; 5 red, 2 green; 4 red…   98      2    5     2
98   6 blue, 5 red, 2 green; 9 red, 1 blue; 2 gree…   99      2    9    11
99   1 blue, 13 green, 14 red; 11 green, 11 blue, …  100     13   15    11

100 rows × 5 columns

Lastly, we just filter the DataFrame to only include games where every drawn count is less than or equal to the number of cubes in the bag (12 red, 13 green, 14 blue; below this is written with strict < against the limits plus one) and sum the IDs of the result!

txt['id'][(txt['green'] < 14) & (txt['red'] < 13) & (txt['blue'] < 15)].sum()

Part 2

As you continue your walk, the Elf poses a second question: in each game you played, what is the fewest number of cubes of each color that could have been in the bag to make the game possible?

Again consider the example games from earlier:

Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green
Game 2: 1 blue, 2 green; 3 green, 4 blue, 1 red; 1 green, 1 blue
Game 3: 8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red
Game 4: 1 green, 3 red, 6 blue; 3 green, 6 red; 3 green, 15 blue, 14 red
Game 5: 6 red, 1 blue, 3 green; 2 blue, 1 red, 2 green
  • In game 1, the game could have been played with as few as 4 red, 2 green, and 6 blue cubes. If any color had even one fewer cube, the game would have been impossible.
  • Game 2 could have been played with a minimum of 1 red, 3 green, and 4 blue cubes.
  • Game 3 must have been played with at least 20 red, 13 green, and 6 blue cubes.
  • Game 4 required at least 14 red, 3 green, and 15 blue cubes.
  • Game 5 needed no fewer than 6 red, 3 green, and 2 blue cubes in the bag.

The power of a set of cubes is equal to the numbers of red, green, and blue cubes multiplied together. The power of the minimum set of cubes in game 1 is 48. In games 2-5 it was 12, 1560, 630, and 36, respectively. Adding up these five powers produces the sum 2286.

For each game, find the minimum set of cubes that must have been present. What is the sum of the power of these sets?
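The minimum set for a single game is just the per-colour maximum over all reveals; a quick check on example game 1 (variable names are mine):

```python
import re

code = "3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green"  # example game 1
need = {"red": 0, "green": 0, "blue": 0}
for n, color in re.findall(r"(\d+) (red|green|blue)", code):
    # the minimum bag must hold the largest count ever revealed per colour
    need[color] = max(need[color], int(n))

power = need["red"] * need["green"] * need["blue"]
print(need, power)  # {'red': 4, 'green': 2, 'blue': 6} 48
```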

Luckily, this task is made trivial by the approach we have taken before. We just have to multiply the green, red and blue columns:

txt['power'] = txt.loc[:, 'green'] * txt.loc[:, 'blue'] * txt.loc[:, 'red']
txt
                                                  code   id  green  red  blue  power
0    1 green, 1 blue, 1 red; 1 green, 8 red, 7 blu…    1      2   10    10    200
1    9 red, 7 green, 3 blue; 15 green, 2 blue, 5 r…    2     15   10     3    450
2    3 red, 1 blue, 4 green; 6 red, 3 green, 2 blu…    3      4    6    16    384
3    2 blue, 2 green, 19 red; 3 blue, 11 red, 16 g…    4     16   20    18   5760
4    8 green, 1 red, 12 blue; 10 green, 6 red, 13 …    5     10    6    14    840
..                                                 ...  ...    ...  ...   ...    ...
95   2 red, 2 green, 1 blue; 1 red, 4 green; 1 green   96      4    2     1      8
96   4 red, 5 green; 5 blue, 3 red; 8 blue, 2 gree…   97      5    4     8    160
97   1 blue; 2 green, 1 red; 5 red, 2 green; 4 red…   98      2    5     2     20
98   6 blue, 5 red, 2 green; 9 red, 1 blue; 2 gree…   99      2    9    11    198
99   1 blue, 13 green, 14 red; 11 green, 11 blue, …  100     13   15    11   2145

100 rows × 6 columns

And for this one, the sum is:

txt['power'].sum()

Advent of code 2023 - Day 2: Cube Conundrum

This year I am trying to record my attempts at solving the Advent of Code 2023 riddles. This is Day 2 - see https://adventofcode.com/2023/day/2

For the solutions, see the English version of this blog post.

Advent of code 2023 - Day 3: Gear Ratios

This year I am trying to record my attempts at solving the Advent of Code 2023 riddles. This is Day 3 - see https://adventofcode.com/2023/day/3

Part 1

The engine schematic (your puzzle input) consists of a visual representation of the engine. There are lots of numbers and symbols you don’t really understand, but apparently any number adjacent to a symbol, even diagonally, is a “part number” and should be included in your sum. (Periods (.) do not count as a symbol.)

Here is an example engine schematic:

467..114..
...*......
..35..633.
......#...
617*......
.....+.58.
..592.....
......755.
...$.*....
.664.598..

In this schematic, two numbers are not part numbers because they are not adjacent to a symbol: 114 (top right) and 58 (middle right). Every other number is adjacent to a symbol and so is a part number; their sum is 4361.

Of course, the actual engine schematic is much larger. What is the sum of all of the part numbers in the engine schematic?

My input file: 2023-12-03-1-aoc.txt

Okay, let’s first get the input as a numpy character array

import numpy as np
import pandas as pd
import sys
import matplotlib.pyplot as plt
from scipy import ndimage as ndi
np.set_printoptions(threshold=sys.maxsize)

txt = pd.read_table('data/2023-12-03-1-aoc.txt', names=['code'])
arr = np.empty((txt.size, txt.size), dtype='<U1')  # one character per cell; the schematic is square

txt['code'] = txt.loc[:, 'code'].apply(lambda x: [i for i in x])

for i, code in enumerate(txt['code']):
    arr[i, :] = code

print((arr[:15, :15]))

Now extract the symbols, the digits, and the empty space. We use the numpy character methods for that: this way, we create a binary mask for all digits and a binary mask for all empty space (.). The symbols are then every character which is neither.

digits = np.char.isdigit(arr)
empty = np.char.endswith(arr, '.')
symbols = ~(digits | empty)

# just for visualization
plt.figure(figsize=(16, 5))
plt.subplot(131, title='symbols').matshow(symbols)
plt.subplot(132, title='digits').matshow(digits)
plt.subplot(133, title='empty').matshow(empty)
plt.show()

Now we use the image processing technique of dilation on the symbols mask, so that we get a new mask which also covers the surroundings of all symbols. We use this afterwards to check whether the digits are near a symbol.

struct = ((1, 1, 1), (1, 1, 1), (1, 1, 1))
dilate = ndi.binary_dilation(symbols, structure=struct)
plt.figure(figsize=(10, 6))
plt.subplot(121, title='symbols').matshow(symbols[:15, :15])
plt.subplot(122, title='symbols dilated').matshow(dilate[:15, :15])
plt.show()
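On a toy mask, the effect of dilation with the full 3×3 structuring element is easy to see: a single True pixel grows into a 3×3 block (standalone sketch, not part of the solution):

```python
import numpy as np
from scipy import ndimage as ndi

mask = np.zeros((5, 5), dtype=bool)
mask[2, 2] = True
struct = np.ones((3, 3), dtype=bool)  # 8-connected neighbourhood
dilated = ndi.binary_dilation(mask, structure=struct)
print(dilated.sum())  # 9
```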

Creating these masks could be understood as a binary segmentation, as each element in our mask is either True or False. To extract the individual numbers, we'll convert the binary digits segmentation into an instance segmentation, where each connected segment has its own index.

markers, num_features = ndi.label(digits)
plt.figure(figsize=(10, 6))
plt.subplot(121, title='binary segmentation').matshow(
    digits[:15, :15])
plt.subplot(122, title='instance segmentation').matshow(
    markers[:15, :15], cmap='gnuplot')
plt.show()
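What ndi.label does can be shown on a toy mask (my own example): each connected component receives its own integer index, and the background stays 0.

```python
import numpy as np
from scipy import ndimage as ndi

mask = np.array([[1, 1, 0, 1],
                 [0, 0, 0, 1]], dtype=bool)
labels, n = ndi.label(mask)  # default structure: 4-connectivity
print(n)       # 2 connected components
print(labels)  # [[1 1 0 2]
               #  [0 0 0 2]]
```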

Now, for each instance, we check if the dilated binary mask overlaps with the instance and if yes, we extract the number.

numbers = [int(''.join(arr[markers == i]))
           for i in range(1, num_features+1)
           if np.any((markers == i) & dilate)]

Then, we just sum up:

sum(numbers)

Part 2

The missing part wasn’t the only issue - one of the gears in the engine is wrong. A gear is any * symbol that is adjacent to exactly two part numbers. Its gear ratio is the result of multiplying those two numbers together.

This time, you need to find the gear ratio of every gear and add them all up so that the engineer can figure out which gear needs to be replaced.


Consider the same engine schematic again:

467..114..
...*......
..35..633.
......#...
617*......
.....+.58.
..592.....
......755.
...$.*....
.664.598..

In this schematic, there are two gears. The first is in the top left; it has part numbers 467 and 35, so its gear ratio is 16345. The second gear is in the lower right; its gear ratio is 451490. (The * adjacent to 617 is not a gear because it is only adjacent to one part number.) Adding up all of the gear ratios produces 467835.

What is the sum of all of the gear ratios in your engine schematic?

We'll use the same method as before, but this time we only extract * and create the instance segmentation before the dilation. Why? Because if we dilated first, we could merge two independent gears into one instance.
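The merging problem shows up already on a one-row toy mask with two gears one cell apart (my own example):

```python
import numpy as np
from scipy import ndimage as ndi

gears = np.array([[True, False, True]])  # two gears one cell apart
_, n_label_first = ndi.label(gears)                     # labelling first keeps them separate
_, n_dilate_first = ndi.label(ndi.binary_dilation(gears))  # dilating first merges them
print(n_label_first, n_dilate_first)  # 2 1
```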

gear = np.char.endswith(arr, '*')
gear_markers, gear_num = ndi.label(gear)

plt.figure(figsize=(10, 6))
plt.subplot(131, title='all symbols dilated').matshow(
    symbols[:15, :15])
plt.subplot(132, title='gears').matshow(
    gear[:15, :15])
plt.subplot(133, title='gears instances').matshow(
    gear_markers[:15, :15], cmap='gnuplot')
plt.show()

Now, we step through each gear instance, create a mask for that gear, dilate it, and then step through all digits and check if they are in the gear mask. If we get two digits in the end, we multiply them to get the gear ratio and save the ratios to a list.

gear_ratios = []
for i in range(1, gear_num+1):
    gear_binary = gear_markers == i
    gear_dil = ndi.binary_dilation(gear_binary, structure=struct)
    part_numbers = [int(''.join(arr[markers == j]))
                    for j in range(1, num_features+1)
                    if np.any((markers == j) & gear_dil)]
    if len(part_numbers) == 2:
        gear_ratios.append(part_numbers[0] * part_numbers[1])

Now, we just sum the output again:

sum(gear_ratios)

Advent of code 2023 - Day 3: Gear Ratios

This year I am trying to record my attempts at solving the Advent of Code 2023 riddles. This is Day 3 - see https://adventofcode.com/2023/day/3

For the solutions, see the English version of this blog post.

Advent of code 2023 - Day 4: Scratchcards

This year I am trying to record my attempts at solving the Advent of Code 2023 riddles. This is Day 4 - see https://adventofcode.com/2023/day/4

Part 1

The Elf leads you over to the pile of colorful cards. There, you discover dozens of scratchcards, all with their opaque covering already scratched off. Picking one up, it looks like each card has two lists of numbers separated by a vertical bar (|): a list of winning numbers and then a list of numbers you have. You organize the information into a table (your puzzle input).

As far as the Elf has been able to figure out, you have to figure out which of the numbers you have appear in the list of winning numbers. The first match makes the card worth one point and each match after the first doubles the point value of that card.

For example:

Card 1: 41 48 83 86 17 | 83 86  6 31 17  9 48 53
Card 2: 13 32 20 16 61 | 61 30 68 82 17 32 24 19
Card 3:  1 21 53 59 44 | 69 82 63 72 16 21 14  1
Card 4: 41 92 73 84 69 | 59 84 76 51 58  5 54 83
Card 5: 87 83 26 28 32 | 88 30 70 12 93 22 82 36
Card 6: 31 18 13 56 72 | 74 77 10 23 35 67 36 11

In the above example, card 1 has five winning numbers (41, 48, 83, 86, and 17) and eight numbers you have (83, 86, 6, 31, 17, 9, 48, and 53). Of the numbers you have, four of them (48, 83, 17, and 86) are winning numbers! That means card 1 is worth 8 points (1 for the first match, then doubled three times for each of the three matches after the first).

  • Card 2 has two winning numbers (32 and 61), so it is worth 2 points.
  • Card 3 has two winning numbers (1 and 21), so it is worth 2 points.
  • Card 4 has one winning number (84), so it is worth 1 point.
  • Card 5 has no winning numbers, so it is worth no points.
  • Card 6 has no winning numbers, so it is worth no points.

So, in this example, the Elf’s pile of scratchcards is worth 13 points.

Take a seat in the large pile of colorful cards. How many points are they worth in total?

My input file: 2023-12-04-1-aoc.txt

Loading this data is very similar to Day 2 - so let's load it the same way. Our goal is to get win and yours columns holding the respective numbers we want to compare. We find the numbers (one or more digits) with the regex \d+. And we want them as sets (not lists), since sets support logical comparison in Python.
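Sets make the matching step a one-liner; on example card 1 the intersection gives exactly the four matches from the problem statement:

```python
win = {41, 48, 83, 86, 17}               # card 1 winning numbers
yours = {83, 86, 6, 31, 17, 9, 48, 53}   # card 1 numbers you have
matches = win & yours                    # set intersection
print(sorted(matches), len(matches))  # [17, 48, 83, 86] 4
```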

import pandas as pd
import re

txt = pd.read_table('data/2023-12-04-1-aoc.txt', names=['win'])
txt['id'] = txt.loc[:, 'win'].str.split(':').apply(
    lambda x: int(x[0].strip('Card ')))
txt['win'] = (txt.loc[:, 'win']
              .str.split(':').apply(lambda x: x[1]))
txt['yours'] = (txt.loc[:, 'win']
                .str.split('|')
                .apply(lambda x: x[1])
                # get a list of only the numbers / digits
                .apply(lambda x: re.findall(r'\d+', x))
                # convert the list of strings to a set of integers
                .apply(lambda x: set([int(i) for i in x])))
txt['win'] = (txt.loc[:, 'win']
              .str.split('|')
              .apply(lambda x: x[0])
              .apply(lambda x: re.findall(r'\d+', x))
              .apply(lambda x: set([int(i) for i in x])))

txt
                                          win   id                                              yours
0      {32, 36, 7, 9, 10, 12, 82, 85, 95, 31}    1  {2, 7, 9, 10, 12, 14, 21, 22, 23, 24, 31, 32, …
1    {35, 76, 16, 82, 19, 22, 88, 59, 60, 95}    2  {7, 8, 12, 16, 19, 22, 26, 28, 35, 38, 44, 51,…
2     {1, 70, 11, 78, 48, 19, 52, 88, 28, 94}    3  {3, 4, 8, 17, 18, 19, 24, 31, 34, 45, 52, 54, …
3     {65, 2, 72, 28, 14, 16, 55, 91, 92, 62}    4  {3, 4, 6, 7, 8, 9, 15, 30, 33, 35, 47, 49, 51,…
4    {38, 41, 75, 77, 50, 24, 94, 60, 61, 30}    5  {1, 2, 4, 5, 6, 7, 9, 10, 14, 17, 21, 29, 47, …
..                                        ...  ...                                                ...
213  {97, 98, 39, 41, 43, 12, 13, 19, 93, 95}  214  {5, 10, 17, 20, 28, 29, 33, 34, 36, 50, 51, 52…
214  {97, 35, 69, 40, 74, 45, 20, 21, 62, 31}  215  {1, 8, 15, 17, 18, 25, 30, 33, 42, 44, 47, 52,…
215  {33, 70, 71, 12, 78, 17, 51, 86, 60, 94}  216  {7, 8, 9, 10, 22, 29, 37, 39, 41, 43, 46, 47, …
216  {98, 67, 68, 38, 70, 39, 72, 77, 45, 21}  217  {8, 21, 22, 25, 26, 31, 37, 41, 42, 48, 54, 57…
217   {34, 9, 44, 78, 79, 16, 17, 19, 55, 92}  218  {1, 4, 20, 21, 27, 38, 39, 40, 41, 45, 46, 52,…

218 rows × 3 columns

Now, we take the set intersection of win and yours - these are our matching numbers. Then, the number of wins is converted to points: a card with n matches is worth 2**(n-1) points, and a card with no matches is worth 0.
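The scoring rule can be written as a tiny function (a sketch; the lambda below is equivalent, since for n = 1 both branches give 1):

```python
def points(n_wins):
    # 1 point for the first match, doubled for every further match
    return 2 ** (n_wins - 1) if n_wins > 0 else 0

print([points(n) for n in (4, 2, 1, 0)])  # [8, 2, 1, 0]
```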

txt['n_wins'] = txt.apply(
    lambda row: len(row.loc['win'] & row.loc['yours']), axis=1)
txt['points'] = txt.loc[:, 'n_wins'].apply(
    lambda x: 2**(x-1) if x > 1 else x)

txt.loc[:, ['win', 'n_wins', 'points']]
                                          win  n_wins  points
0      {32, 36, 7, 9, 10, 12, 82, 85, 95, 31}      10     512
1    {35, 76, 16, 82, 19, 22, 88, 59, 60, 95}      10     512
2     {1, 70, 11, 78, 48, 19, 52, 88, 28, 94}       5      16
3     {65, 2, 72, 28, 14, 16, 55, 91, 92, 62}       0       0
4    {38, 41, 75, 77, 50, 24, 94, 60, 61, 30}       0       0
..                                        ...     ...     ...
213  {97, 98, 39, 41, 43, 12, 13, 19, 93, 95}       0       0
214  {97, 35, 69, 40, 74, 45, 20, 21, 62, 31}       0       0
215  {33, 70, 71, 12, 78, 17, 51, 86, 60, 94}       2       2
216  {98, 67, 68, 38, 70, 39, 72, 77, 45, 21}       1       1
217   {34, 9, 44, 78, 79, 16, 17, 19, 55, 92}       0       0

218 rows × 3 columns

Then, we just sum up:

sum(txt.loc[:, 'points'])

Part 2

There’s no such thing as “points”. Instead, scratchcards only cause you to win more scratchcards equal to the number of winning numbers you have.

Specifically, you win copies of the scratchcards below the winning card equal to the number of matches. So, if card 10 were to have 5 matching numbers, you would win one copy each of cards 11, 12, 13, 14, and 15.

Copies of scratchcards are scored like normal scratchcards and have the same card number as the card they copied. So, if you win a copy of card 10 and it has 5 matching numbers, it would then win a copy of the same cards that the original card 10 won: cards 11, 12, 13, 14, and 15. This process repeats until none of the copies cause you to win any more cards. (Cards will never make you copy a card past the end of the table.)

This time, the above example goes differently:

Card 1: 41 48 83 86 17 | 83 86  6 31 17  9 48 53
Card 2: 13 32 20 16 61 | 61 30 68 82 17 32 24 19
Card 3:  1 21 53 59 44 | 69 82 63 72 16 21 14  1
Card 4: 41 92 73 84 69 | 59 84 76 51 58  5 54 83
Card 5: 87 83 26 28 32 | 88 30 70 12 93 22 82 36
Card 6: 31 18 13 56 72 | 74 77 10 23 35 67 36 11
  • Card 1 has four matching numbers, so you win one copy each of the next four cards: cards 2, 3, 4, and 5.
  • Your original card 2 has two matching numbers, so you win one copy each of cards 3 and 4.
  • Your copy of card 2 also wins one copy each of cards 3 and 4.
  • Your four instances of card 3 (one original and three copies) have two matching numbers, so you win four copies each of cards 4 and 5.
  • Your eight instances of card 4 (one original and seven copies) have one matching number, so you win eight copies of card 5.
  • Your fourteen instances of card 5 (one original and thirteen copies) have no matching numbers and win no more cards.

Once all of the originals and copies have been processed, you end up with 1 instance of card 1, 2 instances of card 2, 4 instances of card 3, 8 instances of card 4, 14 instances of card 5, and 1 instance of card 6. In total, this example pile of scratchcards causes you to ultimately have 30 scratchcards!

Process all of the original and copied scratchcards until no more scratchcards are won. Including the original set of scratchcards, how many total scratchcards do you end up with?

We will use the n_wins column we created before and go from there. We step through each game: the current game first gets +1 card (its original). Then, we step through the following n_wins games, and each of these gets the card count of the current game added.
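The same update rule, run on the six example cards from the problem statement (plain-Python sketch, variable names mine), reproduces the 30 scratchcards:

```python
matches = [4, 2, 2, 1, 0, 0]     # matching numbers of the six example cards
cards = [1] * len(matches)       # one original of each card

for i, m in enumerate(matches):
    for j in range(i + 1, min(i + 1 + m, len(cards))):
        cards[j] += cards[i]     # win cards[i] copies of card j

print(cards, sum(cards))  # [1, 2, 4, 8, 14, 1] 30
```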

txt['cards'] = 0

for i, nwin in enumerate(txt.loc[:, 'n_wins']):
    txt.loc[i, 'cards'] += 1
    for j in range(1, nwin+1):
        txt.loc[i+j, 'cards'] += txt.loc[i, 'cards']

txt.loc[:, ['n_wins', 'cards']]
     n_wins  cards
0        10      1
1        10      2
2         5      4
3         0      8
4         0      8
..      ...    ...
213       0   9608
214       0   8927
215       2   8927
216       1  12636
217       0  21564

218 rows × 2 columns

Now, we just sum the output again:

sum(txt['cards'])

Advent of code 2023 - Day 4: Scratchcards

This year I am trying to record my attempts at solving the Advent of Code 2023 riddles. This is Day 4 - see https://adventofcode.com/2023/day/4

For the solutions, see the English version of this blog post.

Advent of code 2023 - Day 5: If You Give A Seed A Fertilizer

This year I am trying to record my attempts at solving the Advent of Code 2023 riddles. This is Day 5 - see https://adventofcode.com/2023/day/5

Update [2023-12-31 Sun]:

  • substitute get_generator() (own implementation) with range() (Python built-in)
  • improve grid search so that it goes through all location ranges, still starting with the lowest range

Part 1

The almanac (your puzzle input) lists all of the seeds that need to be planted. It also lists what type of soil to use with each kind of seed, what type of fertilizer to use with each kind of soil, what type of water to use with each kind of fertilizer, and so on. Every type of seed, soil, fertilizer and so on is identified with a number, but numbers are reused by each category - that is, soil 123 and fertilizer 123 aren’t necessarily related to each other.

For example:

seeds: 79 14 55 13

seed-to-soil map:
50 98 2
52 50 48

soil-to-fertilizer map:
0 15 37
37 52 2
39 0 15

fertilizer-to-water map:
49 53 8
0 11 42
42 0 7
57 7 4

water-to-light map:
88 18 7
18 25 70

light-to-temperature map:
45 77 23
81 45 19
68 64 13

temperature-to-humidity map:
0 69 1
1 0 69

humidity-to-location map:
60 56 37
56 93 4

The almanac starts by listing which seeds need to be planted: seeds 79, 14, 55, and 13.

The rest of the almanac contains a list of maps which describe how to convert numbers from a source category into numbers in a destination category. That is, the section that starts with seed-to-soil map: describes how to convert a seed number (the source) to a soil number (the destination). This lets the gardener and his team know which soil to use with which seeds, which water to use with which fertilizer, and so on.

Rather than list every source number and its corresponding destination number one by one, the maps describe entire ranges of numbers that can be converted. Each line within a map contains three numbers: the destination range start, the source range start, and the range length.

Consider again the example seed-to-soil map:

50 98 2
52 50 48

  • The first line has a destination range start of 50, a source range start of 98, and a range length of 2. This line means that the source range starts at 98 and contains two values: 98 and 99. The destination range is the same length, but it starts at 50, so its two values are 50 and 51. With this information, you know that seed number 98 corresponds to soil number 50 and that seed number 99 corresponds to soil number 51.
  • The second line means that the source range starts at 50 and contains 48 values: 50, 51, …, 96, 97. This corresponds to a destination range starting at 52 and also containing 48 values: 52, 53, …, 98, 99. So, seed number 53 corresponds to soil number 55.

Any source numbers that aren’t mapped correspond to the same destination number. So, seed number 10 corresponds to soil number 10.
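A minimal sketch of just this rule, hard-coding the two example map lines (not my puzzle input): a source in [src_start, src_start + length) is shifted by (dest_start - src_start); anything unmapped keeps its number.

```python
def soil(seed):
    if 98 <= seed < 98 + 2:    # line "50 98 2"
        return 50 + (seed - 98)
    if 50 <= seed < 50 + 48:   # line "52 50 48"
        return 52 + (seed - 50)
    return seed                # unmapped: same destination number

print(soil(98), soil(53), soil(10))  # → 50 55 10
```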

So, the entire list of seed numbers and their corresponding soil numbers looks like this:

seed  soil
0     0
1     1
...   ...
48    48
49    49
50    52
51    53
...   ...
96    98
97    99
98    50
99    51

With this map, you can look up the soil number required for each initial seed number:

  • Seed number 79 corresponds to soil number 81.
  • Seed number 14 corresponds to soil number 14.
  • Seed number 55 corresponds to soil number 57.
  • Seed number 13 corresponds to soil number 13.

The gardener and his team want to get started as soon as possible, so they’d like to know the closest location that needs a seed. Using these maps, find the lowest location number that corresponds to any of the initial seeds. To do this, you’ll need to convert each seed number through other categories until you can find its corresponding location number. In this example, the corresponding types are:

  • Seed 79, soil 81, fertilizer 81, water 81, light 74, temperature 78, humidity 78, location 82.
  • Seed 14, soil 14, fertilizer 53, water 49, light 42, temperature 42, humidity 43, location 43.
  • Seed 55, soil 57, fertilizer 57, water 53, light 46, temperature 82, humidity 82, location 86.
  • Seed 13, soil 13, fertilizer 52, water 41, light 34, temperature 34, humidity 35, location 35.

So, the lowest location number in this example is 35.
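As a sanity check, the whole chain can be sketched with the example maps (hard-coded from the listing above, not my puzzle input); unmapped numbers pass through unchanged.

```python
# Each map is a list of (dest_start, src_start, length) triples.
example_maps = [
    [(50, 98, 2), (52, 50, 48)],                         # seed-to-soil
    [(0, 15, 37), (37, 52, 2), (39, 0, 15)],             # soil-to-fertilizer
    [(49, 53, 8), (0, 11, 42), (42, 0, 7), (57, 7, 4)],  # fertilizer-to-water
    [(88, 18, 7), (18, 25, 70)],                         # water-to-light
    [(45, 77, 23), (81, 45, 19), (68, 64, 13)],          # light-to-temperature
    [(0, 69, 1), (1, 0, 69)],                            # temperature-to-humidity
    [(60, 56, 37), (56, 93, 4)],                         # humidity-to-location
]

def to_location(seed):
    n = seed
    for mapping in example_maps:
        for dest, src, length in mapping:
            if src <= n < src + length:
                n = dest + (n - src)
                break   # at most one range matches; unmatched stays as-is
    return n

print([to_location(s) for s in (79, 14, 55, 13)])  # → [82, 43, 86, 35]
print(min(to_location(s) for s in (79, 14, 55, 13)))  # → 35
```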

What is the lowest location number that corresponds to any of the initial seed numbers?

My input file: 2023-12-05-1-aoc.txt

Wow, this task is a mouthful…

Let’s start slowly and load the data. Our input text document contains several maps, which are clearly separated and have a title (seed-to-soil map, etc.). So we can tell pandas where each map starts and read each map into its own dataframe. I got the values for the skiprows and nrows arguments by looking at the input file and… counting :)

import pandas as pd
import sys

seeds = pd.read_table('data/2023-12-05-1-aoc.txt', nrows=1, sep=' ',
                      header=None, index_col=0)
seeds = seeds.values.flatten()

map_opt = {'filepath_or_buffer': 'data/2023-12-05-1-aoc.txt',
           'header': None, 'sep': ' ', 'dtype': 'Int64',
           'names': ['dest_start', 'src_start', 'range']}

seed_soil = pd.read_table(skiprows=3, nrows=23, **map_opt)
soil_fert = pd.read_table(skiprows=28, nrows=9, **map_opt)
fert_water = pd.read_table(skiprows=39, nrows=20, **map_opt)
water_light = pd.read_table(skiprows=61, nrows=40, **map_opt)
light_temp = pd.read_table(skiprows=103, nrows=36, **map_opt)
temp_humi = pd.read_table(skiprows=141, nrows=35, **map_opt)
humi_loc = pd.read_table(skiprows=178, nrows=26, **map_opt)

maps = (seed_soil, soil_fert, fert_water, water_light, light_temp,
        temp_humi, humi_loc)

print('seeds are just a numpy array:')
display(seeds)
print('The "humidity-to-location" map as an example:')
humi_loc
    dest_start   src_start      range
0   3490144003  1623866227  218040905
1   1709610578  1620839197    3027030
2    105449249   586389428  113279526
3   1899604338  2167886292  348178199
4   1712637608  2678624215  186966730
5    218728775           0  245776251
6   2472992580   923734334  143670388
7   2616662968  3670169885   15297294
8   2247782537  1395629154  225210043
9            0   480940179  105449249
10  4113852846  3959729057  159909096
11  3322784653  1067404722  167359350
12   923734334  4119638153  175329143
13  1534964496  2516064491  162559724
14  2631960262  3496028440  140849733
15  2862695906  1234764072   97988355
16  1697524220  3636878173   12086358
17  2985646048  1332752427   62876727
18  1099063477  2970241510  435901019
19  4009202281  2865590945  104650565
20  2960684261  2142924505   24961787
21  2772809995  3406142529   89885911
22  4273761942  3648964531   21205354
23  3708184908  1841907132  301017373
24   464505026   245776251  235163928
25  3048522775  3685467179  274261878

My first attempt was to actually construct ranges like in the example above, mapping out all possible sources and destinations. Python quickly informed me that even constructing one pandas.Series with int64 values mapping seeds to soil would cost 64 GB of memory - not the best solution.

So we take a different approach. For convenience, let’s add a src_end and a dest_end column to our maps:

for df in maps:
    df['src_end'] = df.loc[:, 'src_start'] + df.loc[:, 'range']
    df['dest_end'] = df.loc[:, 'dest_start'] + df.loc[:, 'range']

print('Again the "humidity-to-location" map as an example:')
humi_loc
    dest_start   src_start      range     src_end    dest_end
0   3490144003  1623866227  218040905  1841907132  3708184908
1   1709610578  1620839197    3027030  1623866227  1712637608
2    105449249   586389428  113279526   699668954   218728775
3   1899604338  2167886292  348178199  2516064491  2247782537
4   1712637608  2678624215  186966730  2865590945  1899604338
5    218728775           0  245776251   245776251   464505026
6   2472992580   923734334  143670388  1067404722  2616662968
7   2616662968  3670169885   15297294  3685467179  2631960262
8   2247782537  1395629154  225210043  1620839197  2472992580
9            0   480940179  105449249   586389428   105449249
10  4113852846  3959729057  159909096  4119638153  4273761942
11  3322784653  1067404722  167359350  1234764072  3490144003
12   923734334  4119638153  175329143  4294967296  1099063477
13  1534964496  2516064491  162559724  2678624215  1697524220
14  2631960262  3496028440  140849733  3636878173  2772809995
15  2862695906  1234764072   97988355  1332752427  2960684261
16  1697524220  3636878173   12086358  3648964531  1709610578
17  2985646048  1332752427   62876727  1395629154  3048522775
18  1099063477  2970241510  435901019  3406142529  1534964496
19  4009202281  2865590945  104650565  2970241510  4113852846
20  2960684261  2142924505   24961787  2167886292  2985646048
21  2772809995  3406142529   89885911  3496028440  2862695906
22  4273761942  3648964531   21205354  3670169885  4294967296
23  3708184908  1841907132  301017373  2142924505  4009202281
24   464505026   245776251  235163928   480940179   699668954
25  3048522775  3685467179  274261878  3959729057  3322784653

Now we actually compute the mapping. For each seed, we go through all maps, and within each map we go through every row. We find the row whose source range contains the current value and extract the corresponding destination, which becomes the source for the next map, until the last map gives us the location.

  • Approach 1: df.itertuples() is a convenient way to step through a pandas.DataFrame in this example. It is faster than df.iterrows() and returns the row as a NamedTuple - nice!
  • Approach 2: I actually wondered if it would be faster to get all maps in one pd.DataFrame and then iterate through the mappings. To test this let’s construct a new DataFrame maps_df which contains all maps. Since the maps have different lengths it is important to cast the datatype to Int64, which is short for pd.Int64Dtype() and keeps values as integers, even if NA values are in the same column.
  • Approach 3: A third alternative I tested (not shown here) was to check if a value is in a Python range with the in operator as in: if 3 in range(5):... . This was way too slow.
# mapping version 1
def get_location(seed):
    current = seed
    for df in maps:
        # the source range is [src_start, src_end), so the lower
        # bound must be inclusive
        current_map = [row
                       for row in df.itertuples()
                       if ((current >= row.src_start)
                           and (current < row.src_end))]
        if len(current_map) == 0:
            pass  # unmapped values stay the same
        elif len(current_map) == 1:
            current = (current_map[0].dest_start
                       + (current - current_map[0].src_start))
        else:
            raise ValueError('This should not happen!')
    return current

# mapping version 2 - around 10 times slower
# you need to rename the maps_df columns so that they have a unique id
# e.g. 'src_start_1', 'src_start_2' etc

# maps_df = pd.concat(maps, axis='columns')

# def get_dest(i, src):
#     return (maps_df[(src > maps_df.loc[:, f'src_start_{i}']) &
#                     (src < maps_df.loc[:, f'src_end_{i}'])]
#             .loc[:, f'dest_start_{i}']
#             .iloc[0])

# def get_location2(seed):
#     dest = seed
#     i = 1
#     while i < 7:
#         dest = get_dest(i, dest)
#         i += 1
#     return dest

%timeit get_location(seeds[0])

Now let’s get a list of locations:

locations = [get_location(s) for s in seeds]
locations

Lastly, just get the minimum of all location values.

min(locations)

Part 2

Everyone will starve if you only plant such a small number of seeds. Re-reading the almanac, it looks like the seeds: line actually describes ranges of seed numbers.

The values on the initial seeds: line come in pairs. Within each pair, the first value is the start of the range and the second value is the length of the range. So, in the first line of the example above:

seeds: 79 14 55 13

This line describes two ranges of seed numbers to be planted in the garden. The first range starts with seed number 79 and contains 14 values: 79, 80, …, 91, 92. The second range starts with seed number 55 and contains 13 values: 55, 56, …, 66, 67.

Now, rather than considering four seed numbers, you need to consider a total of 27 seed numbers.

In the above example, the lowest location number can be obtained from seed number 82, which corresponds to soil 84, fertilizer 84, water 84, light 77, temperature 45, humidity 46, and location 46. So, the lowest location number is 46.

Consider all of the initial seed numbers listed in the ranges on the first line of the almanac. What is the lowest location number that corresponds to any of the initial seed numbers?

Let’s first construct a dataframe containing the range of seeds:

seeds_df = pd.DataFrame({'start': seeds[::2],
                         'range': seeds[1::2],
                         'end': seeds[::2] + seeds[1::2]})
print(f'There are {sum(seeds_df.loc[:, "range"]):_} seeds in total')
seeds_df

        start      range         end
0  3169137700  271717609  3440855309
1  3522125441   23376095  3545501536
2  1233948799  811833837  2045782636
3   280549587  703867355   984416942
4   166086528   44766996   210853524
5  2326968141   69162222  2396130363
6  2698492851   14603069  2713095920
7  2755327667  348999531  3104327198
8  2600461189   92332846  2692794035
9  1054656969  169099767  1223756736

Now - I really needed some time to realize that going through all seed values one by one is infeasible. So how do we deal with this problem?

In the end we need the lowest location number - so our approach is to start with the lowest location number in the humi_loc map, go upwards, and compute the seed corresponding to each location. The first location whose seed lies inside seeds_df is our solution.

So first we rebuild the get_location function to a get_seed function (we reverse the maps tuple with maps[::-1] and switch src and dest).

def get_seed(location):
    current = location
    for df in maps[::-1]:
        current_map = [row
                       for row in df.itertuples()
                       if ((current >= row.dest_start)
                           and (current < row.dest_end))]
        if len(current_map) == 0:
            pass
        elif len(current_map) == 1:
            current = (current_map[0].src_start
                       + (current - current_map[0].dest_start))
        else:
            raise ValueError('This should not happen!')
    return current

Since the Python 3 range function does not store its contents in memory (similar to a generator), it is well suited for stepping through these large ranges.
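A quick illustration of this: a range object computes its values on demand instead of storing them, so its memory footprint stays constant no matter how long the range is.

```python
import sys

# the object itself is tiny, independent of the range length
print(sys.getsizeof(range(10**12)))  # same small size as range(10)
print(sys.getsizeof(range(10)))
```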

Lastly, we deal with the sheer amount of possible values by performing a grid search. First, we check every millionth location. After the first match, we stop this search and check the last million locations before the match with a finer grid and so on. The last grid is just 1, so we find our lowest location.

Update: We order our locations from smallest to largest with sort_values and go through them - but the search stops after the first match, since that will be our lowest location.

def grid_search(start: int, end: int, grid: list[int]):
    success = False
    for g in grid:
        print(f'Start search with grid={g}')
        for l in range(start, end, g):
            current = get_seed(l)
            if any((current >= seeds_df.loc[:, 'start']) & (current < seeds_df.loc[:, 'end'])):
                print(f'location {l} is the lowest which contains one of the given seeds ({current})')
                start = l - g
                end = l
                success = True
                break
    return success

for row in humi_loc.sort_values('dest_start').itertuples():
    success = grid_search(start=row.dest_start, end=row.dest_end,
                          grid=[1_000_000, 100_000, 1000, 1])
    if success:
        print('Finished search')
        break

Advent of code 2023 - Tag 5: Wenn Du Einem Samen Dünger Gibst

This year I am attempting the Advent of Code 2023 challenges. This is Day 5 - see https://adventofcode.com/2023/day/5

For the solutions, see the English version of this blog post.

Advent of code 2023 - Day 6: Wait For It

This year I am recording my attempts at solving the Advent of Code 2023 riddles. This is Day 6 - see https://adventofcode.com/2023/day/6

Part 1

You manage to sign up as a competitor in the boat races just in time. The organizer explains that it’s not really a traditional race - instead, you will get a fixed amount of time during which your boat has to travel as far as it can, and you win if your boat goes the farthest.

As part of signing up, you get a sheet of paper (your puzzle input) that lists the time allowed for each race and also the best distance ever recorded in that race. To guarantee you win the grand prize, you need to make sure you go farther in each race than the current record holder.

The organizer brings you over to the area where the boat races are held. The boats are much smaller than you expected - they’re actually toy boats, each with a big button on top. Holding down the button charges the boat, and releasing the button allows the boat to move. Boats move faster if their button was held longer, but time spent holding the button counts against the total race time. You can only hold the button at the start of the race, and boats don’t move until the button is released.

For example:

Time:      7  15   30
Distance:  9  40  200

This document describes three races:

  • The first race lasts 7 milliseconds. The record distance in this race is 9 millimeters.
  • The second race lasts 15 milliseconds. The record distance in this race is 40 millimeters.
  • The third race lasts 30 milliseconds. The record distance in this race is 200 millimeters.

Your toy boat has a starting speed of zero millimeters per millisecond. For each whole millisecond you spend at the beginning of the race holding down the button, the boat’s speed increases by one millimeter per millisecond.

So, because the first race lasts 7 milliseconds, you only have a few options:

  • Don’t hold the button at all (that is, hold it for =0= milliseconds) at the start of the race. The boat won’t move; it will have traveled =0= millimeters by the end of the race.
  • Hold the button for =1= millisecond at the start of the race. Then, the boat will travel at a speed of 1 millimeter per millisecond for 6 milliseconds, reaching a total distance traveled of =6= millimeters.
  • Hold the button for =2= milliseconds, giving the boat a speed of 2 millimeters per millisecond. It will then get 5 milliseconds to move, reaching a total distance of =10= millimeters.
  • Hold the button for =3= milliseconds. After its remaining 4 milliseconds of travel time, the boat will have gone =12= millimeters.
  • Hold the button for =4= milliseconds. After its remaining 3 milliseconds of travel time, the boat will have gone =12= millimeters.
  • Hold the button for =5= milliseconds, causing the boat to travel a total of =10= millimeters.
  • Hold the button for =6= milliseconds, causing the boat to travel a total of =6= millimeters.
  • Hold the button for =7= milliseconds. That’s the entire duration of the race. You never let go of the button. The boat can’t move until you let go of the button. Please make sure you let go of the button so the boat gets to move. =0= millimeters.

Since the current record for this race is 9 millimeters, there are actually =4= different ways you could win: you could hold the button for 2, 3, 4, or 5 milliseconds at the start of the race.

In the second race, you could hold the button for at least 4 milliseconds and at most 11 milliseconds and beat the record, a total of =8= different ways to win.

In the third race, you could hold the button for at least 11 milliseconds and no more than 19 milliseconds and still beat the record, a total of =9= ways you could win.

To see how much margin of error you have, determine the number of ways you can beat the record in each race; in this example, if you multiply these values together, you get =288= (4 * 8 * 9).
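The counting rule above can be sanity-checked on the example with a brute-force sketch: holding the button for t ms gives speed t, leaving (time - t) ms to travel.

```python
import math

times, dists = [7, 15, 30], [9, 40, 200]
ways = [sum(1 for t in range(time + 1) if t * (time - t) > dist)
        for time, dist in zip(times, dists)]
print(ways, math.prod(ways))  # → [4, 8, 9] 288
```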

Determine the number of ways you could beat the record in each race. What do you get if you multiply these numbers together?

My input file: 2023-12-06-1-aoc.txt

The solution for this problem is shorter than the task description!

As always - let’s load the data. Note: we use the delim_whitespace argument instead of sep=' ' to split on whitespace of any length.

import pandas as pd
import numpy as np

race = pd.read_table('data/2023-12-06-1-aoc.txt', delim_whitespace=True,
                     header=None, index_col=0)
race.index = ['time', 'distance']
race
             1     2     3     4
time        51    92    68    90
distance   222  2031  1126  1225

Then we use Python list comprehensions to quickly calculate all distances for the range of the time value (keeping only the winning distances). Then we take the respective lengths and compute their product.

def get_distance(time, distance):
    return [t * (time - t) for t in range(1, time + 1)
            if (t * (time - t)) > distance]

race.loc['nwins', :] = [len(get_distance(int(r.time), int(r.distance)))
                        for c, r in race.items()]
nwins_prod = np.prod(race.loc['nwins', :].values, dtype=int)
display(race)
print(f'The product of our nwins is {nwins_prod}')
              1       2       3       4
time       51.0    92.0    68.0    90.0
distance  222.0  2031.0  1126.0  1225.0
nwins      42.0    19.0    11.0    57.0
The product of our nwins is 500346

Part 2

As the race is about to start, you realize the piece of paper with race times and record distances you got earlier actually just has very bad kerning. There’s really only one race - ignore the spaces between the numbers on each line.

So, the example from before:

Time:      7  15   30
Distance:  9  40  200

…now instead means this:

Time:      71530
Distance:  940200

Now, you have to figure out how many ways there are to win this single race. In this example, the race lasts for =71530= milliseconds and the record distance you need to beat is =940200= millimeters. You could hold the button anywhere from 14 to 71516 milliseconds and beat the record, a total of =71503= ways!

How many ways can you beat the record in this one much longer race?

For this riddle we can take the same approach as for Part 1. The cell took 18s to evaluate on a consumer laptop from 2015 - no fancy workarounds needed :)
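As an aside (not what the post does above): the winning hold times are exactly the integers strictly between the roots of t² - time·t + distance = 0, so the count also has a closed form. A sketch - beware floating-point precision at the exact boundaries for very large inputs.

```python
import math

def count_wins(time, distance):
    # t * (time - t) > distance  <=>  t strictly between the two roots
    disc = math.sqrt(time * time - 4 * distance)
    lo = math.floor((time - disc) / 2) + 1   # smallest winning hold time
    hi = math.ceil((time + disc) / 2) - 1    # largest winning hold time
    return hi - lo + 1

print(count_wins(71530, 940200))  # → 71503 (the example race from the task)
```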

racetime = int(''.join(str(r) for r in race.loc['time', :4].astype(int)))
racedist = int(''.join(str(r) for r in race.loc['distance', :4].astype(int)))
racewins = len(get_distance(racetime, racedist))

race[5] = [racetime, racedist, racewins]
race
              1       2       3       4                5
time       51.0    92.0    68.0    90.0         51926890
distance  222.0  2031.0  1126.0  1225.0  222203111261225
nwins      42.0    19.0    11.0    57.0         42515755

Advent of code 2023 - Tag 6: Warte Darauf

This year I am attempting the Advent of Code 2023 challenges. This is Day 6 - see https://adventofcode.com/2023/day/6

For the solutions, see the English version of this blog post.

Advent of code 2023 - Day 7: Camel Cards

This year I am recording my attempts at solving the Advent of Code 2023 riddles. This is Day 7 - see https://adventofcode.com/2023/day/7

Part 1

In Camel Cards, you get a list of hands, and your goal is to order them based on the strength of each hand. A hand consists of five cards labeled one of A, K, Q, J, T, 9, 8, 7, 6, 5, 4, 3, or 2. The relative strength of each card follows this order, where A is the highest and 2 is the lowest.

Every hand is exactly one type. From strongest to weakest, they are:

  • Five of a kind, where all five cards have the same label: AAAAA
  • Four of a kind, where four cards have the same label and one card has a different label: AA8AA
  • Full house, where three cards have the same label, and the remaining two cards share a different label: 23332
  • Three of a kind, where three cards have the same label, and the remaining two cards are each different from any other card in the hand: TTT98
  • Two pair, where two cards share one label, two other cards share a second label, and the remaining card has a third label: 23432
  • One pair, where two cards share one label, and the other three cards have a different label from the pair and each other: A23A4
  • High card, where all cards’ labels are distinct: 23456

Hands are primarily ordered based on type; for example, every full house is stronger than any three of a kind.

If two hands have the same type, a second ordering rule takes effect. Start by comparing the first card in each hand. If these cards are different, the hand with the stronger first card is considered stronger. If the first card in each hand have the same label, however, then move on to considering the second card in each hand. If they differ, the hand with the higher second card wins; otherwise, continue with the third card in each hand, then the fourth, then the fifth.

So, 33332 and 2AAAA are both four of a kind hands, but 33332 is stronger because its first card is stronger. Similarly, 77888 and 77788 are both a full house, but 77888 is stronger because its third card is stronger (and both hands have the same first and second card).

To play Camel Cards, you are given a list of hands and their corresponding bid (your puzzle input). For example:

32T3K 765
T55J5 684
KK677 28
KTJJT 220
QQQJA 483

This example shows five hands; each hand is followed by its bid amount. Each hand wins an amount equal to its bid multiplied by its rank, where the weakest hand gets rank 1, the second-weakest hand gets rank 2, and so on up to the strongest hand. Because there are five hands in this example, the strongest hand will have rank 5 and its bid will be multiplied by 5.

So, the first step is to put the hands in order of strength:

  • 32T3K is the only one pair and the other hands are all a stronger type, so it gets rank 1.
  • KK677 and KTJJT are both two pair. Their first cards both have the same label, but the second card of KK677 is stronger (K vs T), so KTJJT gets rank 2 and KK677 gets rank 3.
  • T55J5 and QQQJA are both three of a kind. QQQJA has a stronger first card, so it gets rank 5 and T55J5 gets rank 4.

Now, you can determine the total winnings of this set of hands by adding up the result of multiplying each hand’s bid with its rank (765 * 1 + 220 * 2 + 28 * 3 + 684 * 4 + 483 * 5). So the total winnings in this example are =6440=.
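The example arithmetic above can be checked in one line: rank 1 is the weakest hand, and the winnings are the sum of bid times rank.

```python
# example bids ordered from weakest to strongest hand (ranks 1..5)
bids_by_rank = [765, 220, 28, 684, 483]
total = sum(bid * rank for rank, bid in enumerate(bids_by_rank, start=1))
print(total)  # → 6440
```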

Find the rank of every hand in your set. What are the total winnings?

My input file: 2023-12-07-1-aoc.txt

As always - let’s read in the data.

import pandas as pd

camel = pd.read_table('data/2023-12-07-1-aoc.txt', delim_whitespace=True,
                      header=None, names=['cards', 'bid'],
                      dtype={'cards': 'string', 'bid': 'Int64'})
camel
     cards  bid
0    A833A  309
1    Q33J3  205
2    55KK5  590
3    K4457  924
4    Q3QT3   11
...    ...  ...
995  KQJ53  367
996  57866  537
997  Q94A9  210
998  J448A  903
999  6J66J  114

1000 rows × 2 columns

For sorting, we need to compute a type column. The 7 types can be deduced from how many distinct card labels a hand contains (via set()) and how often each label occurs.

Also, for the next step we prepare one column per card string in the cards.

def get_type(cards):
    cardset = set(cards)
    cardlen = len(cardset)
    cardlist = list(cards)
    cardnum = [cardlist.count(i) for i in cardset]
    if cardlen == 1:
        ctype = 1                                    # five of a kind
    elif cardlen in (4, 5):
        ctype = cardlen + 2                          # one pair (6) or high card (7)
    elif cardlen == 2:
        ctype = 2 if set(cardnum) == {1, 4} else 3   # four of a kind or full house
    else:
        ctype = 4 if 3 in cardnum else 5             # three of a kind or two pair
    return ctype

camel['type'] = camel.loc[:, 'cards'].apply(lambda x: get_type(x))
for i in range(5):
    camel[f'c{i}'] = camel.loc[:, 'cards'].apply(lambda x: list(x)[i])
camel
     cards  bid  type c0 c1 c2 c3 c4
0    A833A  309     5  A  8  3  3  A
1    Q33J3  205     4  Q  3  3  J  3
2    55KK5  590     3  5  5  K  K  5
3    K4457  924     6  K  4  4  5  7
4    Q3QT3   11     5  Q  3  Q  T  3
...    ...  ...   ... .. .. .. .. ..
995  KQJ53  367     7  K  Q  J  5  3
996  57866  537     6  5  7  8  6  6
997  Q94A9  210     6  Q  9  4  A  9
998  J448A  903     6  J  4  4  8  A
999  6J66J  114     3  6  J  6  6  J

1000 rows × 8 columns

For sorting, pandas provides the sort_values method, which supports sorting by multiple columns (here, we first sort by type, then c0, c1, …). To sort by numerical values, we convert the labels using the card_dict dictionary, which implements the given ordering.

card_dict = {c: i for i, c in enumerate([7, 6, 5, 4, 3, 2, 1, '2', '3',
                                         '4', '5', '6', '7', '8', '9',
                                         'T', 'J', 'Q', 'K', 'A'])}

camel = camel.sort_values(by=['type', 'c0', 'c1', 'c2', 'c3', 'c4'],
                          ignore_index=True,
                          key=lambda x: x.map(card_dict))
camel.index += 1
camel
cardsbidtypec0c1c2c3c4
1237AQ1577237AQ
223K47759723K47
3249K86127249K8
4264A73417264A7
52657810726578
996AA5AA5652AA5AA
997AAATA7042AAATA
998AAAA31452AAAA3
999AAAAJ5942AAAAJ
1000JJJJJ2191JJJJJ

1000 rows × 8 columns

Lastly, we multiply the bid with the index and sum the result.

camel.loc[:, 'bid'].mul(camel.index).sum()

Part 2

To make things a little more interesting, the Elf introduces one additional rule. Now, J cards are jokers - wildcards that can act like whatever card would make the hand the strongest type possible.

To balance this, =J= cards are now the weakest individual cards, weaker even than 2. The other cards stay in the same order: A, K, Q, T, 9, 8, 7, 6, 5, 4, 3, 2, J.

J cards can pretend to be whatever card is best for the purpose of determining hand type; for example, QJJQ2 is now considered four of a kind. However, for the purpose of breaking ties between two hands of the same type, J is always treated as J, not the card it’s pretending to be: JKKK2 is weaker than QQQQ2 because J is weaker than Q.

Now, the above example goes very differently:

32T3K 765
T55J5 684
KK677 28
KTJJT 220
QQQJA 483
  • 32T3K is still the only one pair; it doesn’t contain any jokers, so its strength doesn’t increase.
  • KK677 is now the only two pair, making it the second-weakest hand.
  • T55J5, KTJJT, and QQQJA are now all four of a kind! T55J5 gets rank 3, QQQJA gets rank 4, and KTJJT gets rank 5.

With the new joker rule, the total winnings in this example are =5905=.

Using the new joker rule, find the rank of every hand in your set. What are the new total winnings?

What changes? We have to update our get_type() function and our card_dict dictionary, otherwise the method stays the same!

Let’s first write get_type_joker(): if no joker is in the cards, we can just refer to get_type(). If we have 1, 2 or 5 different card faces, the answer is trivial. We only have to use the number of jokers and the maximum number of other faces if we have 3 or 4 different faces.

Note: if a J is present, the hand can never be a high card (type = 7), because we can always make at least one pair. It can also never be two pair (type = 5), because a joker always lets us construct at least three of a kind or a full house.

def get_type_joker(cards):
    cardset = set(cards)
    cardlen = len(cardset)
    cardlist = list(cards)
    cardnum = [cardlist.count(i) for i in cardset]
    if 'J' not in cardset:
        ctype = get_type(cards)
    else:
        idxj = list(cardset).index('J')
        numj = cardnum[idxj]
        cardnum.pop(idxj)                 # keep only the non-joker counts
        if cardlen in [1, 2]:
            ctype = 1                     # jokers complete five of a kind
        elif cardlen == 5:
            ctype = 6                     # one joker, four distinct faces: one pair
        else:
            if (numj in [1, 2]) and (cardlen == 4):
                ctype = 4                 # three of a kind
            else:
                ctype = 6 - max(cardnum) - numj   # four of a kind or full house
    return ctype

camel['type'] = camel.loc[:, 'cards'].apply(lambda x: get_type_joker(x))

camel
      cards  bid  type c0 c1 c2 c3 c4
1     237AQ  157     7  2  3  7  A  Q
2     23K47  759     7  2  3  K  4  7
3     249K8  612     7  2  4  9  K  8
4     264A7  341     7  2  6  4  A  7
5     26578   10     7  2  6  5  7  8
...     ...  ...   ... .. .. .. .. ..
996   AA5AA  565     2  A  A  5  A  A
997   AAATA  704     2  A  A  A  T  A
998   AAAA3  145     2  A  A  A  A  3
999   AAAAJ  594     1  A  A  A  A  J
1000  JJJJJ  219     1  J  J  J  J  J

1000 rows × 8 columns

Now, we just move J in our new card_dict_joker to where it belongs, the rest of the solution is the same.

card_dict_joker = {c: i for i, c in enumerate(
    [7, 6, 5, 4, 3, 2, 1, 'J', '2', '3', '4', '5', '6', '7', '8', '9',
     'T', 'Q', 'K', 'A'])}

camel = camel.sort_values(by=['type', 'c0', 'c1', 'c2', 'c3', 'c4'],
                          ignore_index=True,
                          key=lambda x: x.map(card_dict_joker))
camel.index += 1
display(camel)
print('Our new solution: ', camel.loc[:, 'bid'].mul(camel.index).sum())
      cards  bid  type c0 c1 c2 c3 c4
1     237AQ  157     7  2  3  7  A  Q
2     23K47  759     7  2  3  K  4  7
3     249K8  612     7  2  4  9  K  8
4     264A7  341     7  2  6  4  A  7
5     26578   10     7  2  6  5  7  8
...     ...  ...   ... .. .. .. .. ..
996   TJTTJ  609     1  T  J  T  T  J
997   QQJQJ  131     1  Q  Q  J  Q  J
998   QQQJQ  831     1  Q  Q  Q  J  Q
999   KKKJK  183     1  K  K  K  J  K
1000  AAAAJ  594     1  A  A  A  A  J

1000 rows × 8 columns

Our new solution:  246894760

Advent of code 2023 - Tag 7: Kamelkarten

This year I am attempting the Advent of Code 2023 challenges. This is Day 7 - see https://adventofcode.com/2023/day/7

For the solutions, see the English version of this blog post.

Advent of code 2024 - Day 1: Historian Hysteria

I try to solve this year’s Advent of Code 2024 riddles in R. This is Day 1 - see https://adventofcode.com/2024/day/1

Part 1: Locations

Throughout the Chief’s office, the historically significant locations are listed not by name but by a unique number called the location ID. To make sure they don’t miss anything, The Historians split into two groups, each searching the office and trying to create their own complete list of location IDs.

There’s just one problem: by holding the two lists up side by side (your puzzle input), it quickly becomes clear that the lists aren’t very similar. Maybe you can help The Historians reconcile their lists?

For example:

3   4
4   3
2   5
1   3
3   9
3   3

Maybe the lists are only off by a small amount! To find out, pair up the numbers and measure how far apart they are. Pair up the smallest number in the left list with the smallest number in the right list, then the second-smallest left number with the second-smallest right number, and so on.

Within each pair, figure out how far apart the two numbers are; you’ll need to add up all of those distances. For example, if you pair up a 3 from the left list with a 7 from the right list, the distance apart is 4; if you pair up a 9 with a 3, the distance apart is 6.

In the example list above, the pairs and distances would be as follows:

  • The smallest number in the left list is 1, and the smallest number in the right list is 3. The distance between them is =2=.
  • The second-smallest number in the left list is 2, and the second-smallest number in the right list is another 3. The distance between them is =1=.
  • The third-smallest number in both lists is 3, so the distance between them is =0=.
  • The next numbers to pair up are 3 and 4, a distance of =1=.
  • The fifth-smallest numbers in each list are 3 and 5, a distance of =2=.
  • Finally, the largest number in the left list is 4, while the largest number in the right list is 9; these are a distance =5= apart.

To find the total distance between the left list and the right list, add up the distances between all of the pairs you found. In the example above, this is 2 + 1 + 0 + 1 + 2 + 5, a total distance of =11=!

Your actual left and right lists contain many location IDs. What is the total distance between your lists?

My input file: 2024-12-01-1-aoc.txt

First, we read the data - the fread command from the data.table package is more versatile than base R’s read.delim and directly reads the data as a data.table, which has some benefits.

data <- data.table::fread("2024-12-01-1-aoc.txt", header = FALSE)
head(data)
| V1 | V2 |
|-------|-------|
| <int> | <int> |
| 41226 | 69190 |
| 89318 | 10100 |
| 59419 | 23880 |
| 63157 | 20193 |
| 81510 | 22869 |
| 83942 | 63304 |

Now, the idea is to get the ordered versions of V1 and V2.

v1 <- data[order(V1)]$V1
v2 <- data[order(V2)]$V2
head(v1)

Coming from Python, this syntax looks a bit odd. The documentation of data.table::setorder helps:

setorder (and setorderv) reorders the rows of a data.table based on the columns (and column
order) provided. It reorders the table by reference and is therefore very memory efficient.

Note that queries like x[order(.)] are optimised internally to use data.table's fast order.

So, data[order(V1)] uses data.table’s fast internal ordering and gives the same result as reordering with data.table::setorder(data, V1) (which works by reference). Then we extract the vector by name using the $ operator.

head(data.table::setorder(data, V1)$V1)

The actual computation is just the sum of the absolute difference:

sum(abs(v1 - v2))
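As a cross-check, the sort-and-pair idea can be sketched in Python on the example lists from the puzzle text (a minimal sketch, not part of the R solution):

```python
# Example lists from the puzzle text.
left = [3, 4, 2, 1, 3, 3]
right = [4, 3, 5, 3, 9, 3]
# Sort both lists, pair element-wise, and sum the absolute distances.
total = sum(abs(a - b) for a, b in zip(sorted(left), sorted(right)))
print(total)  # 11
```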

Part 2: Similarities

This time, you’ll need to figure out exactly how often each number from the left list appears in the right list. Calculate a total similarity score by adding up each number in the left list after multiplying it by the number of times that number appears in the right list.

Here are the same example lists again:

3   4
4   3
2   5
1   3
3   9
3   3

For these example lists, here is the process of finding the similarity score:

  • The first number in the left list is 3. It appears in the right list three times, so the similarity score increases by 3 * 3 = 9.
  • The second number in the left list is 4. It appears in the right list once, so the similarity score increases by 4 * 1 = 4.
  • The third number in the left list is 2. It does not appear in the right list, so the similarity score does not increase (2 * 0 = 0).
  • The fourth number, 1, also does not appear in the right list.
  • The fifth number, 3, appears in the right list three times; the similarity score increases by =9=.
  • The last number, 3, appears in the right list three times; the similarity score again increases by =9=.

So, for these example lists, the similarity score at the end of this process is =31= (9 + 4 + 0 + 0 + 9 + 9).

Once again consider your left and right lists. What is their similarity score?

My idea here is simple - we first count the occurrences in V2 so that we can check V1 against them. The table command works like pandas.value_counts() and achieves this. We convert the output to a data.frame and assign the counted V2 values as row names. Note: table converts the counted values to strings, e.g. "10019" instead of 10019.

tab <- table(v2)
v2series <- data.frame(c(tab), row.names = names(tab))
head(v2series)
v2series["10019", ]
| | c.tab. |
|-------|--------|
| | <int> |
| 10019 | 1 |
| 10100 | 1 |
| 10206 | 1 |
| 10428 | 1 |
| 10645 | 1 |
| 10972 | 1 |

1

Now, we loop through V1 and look up each value’s frequency in V2 (converting to string first via as.character). If nothing is found in V2, an NA value is returned. For everything that is not NA, we compute the similarity score, append it to a list, and sum at the end. Note: a list in R cannot be summed directly - sum() only works on atomic vectors, while a list may hold elements of arbitrary types - so we have to unlist first.

sim <- list()
for (num in v1) {
    myfreq <- v2series[as.character(num), ]
    if (!is.na(myfreq)) {
        score <- myfreq * num
        sim <- append(sim, score)
    }

}
sum(unlist(sim))
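The counting approach translates naturally to Python, where collections.Counter plays the role of R’s table (a sketch on the example lists, not the R solution above):

```python
from collections import Counter

# Example lists from the puzzle text.
left = [3, 4, 2, 1, 3, 3]
right = [4, 3, 5, 3, 9, 3]
counts = Counter(right)  # plays the role of table(v2)
# Counter returns 0 for missing keys, so no NA handling is needed.
score = sum(n * counts[n] for n in left)
print(score)  # 31
```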

Advent of code 2024 - Day 1: Historian Hysteria

This year I am attempting the Advent of Code 2024 challenges in the statistics-focused programming language R. This is day 1 - see https://adventofcode.com/2024/day/1

For the solutions, see the English version of this blog post.

Advent of code 2024 - Day 2: Red-Nosed Reports

I try to solve this year’s Advent of Code 2024 riddles in R. This is Day 2 - see https://adventofcode.com/2024/day/2

Part 1: Safe reports

The unusual data (your puzzle input) consists of many reports, one report per line. Each report is a list of numbers called levels that are separated by spaces. For example:

7 6 4 2 1
1 2 7 8 9
9 7 6 2 1
1 3 2 4 5
8 6 4 4 1
1 3 6 7 9

This example data contains six reports each containing five levels.

The engineers are trying to figure out which reports are safe. The Red-Nosed reactor safety systems can only tolerate levels that are either gradually increasing or gradually decreasing. So, a report only counts as safe if both of the following are true:

  • The levels are either all increasing or all decreasing.
  • Any two adjacent levels differ by at least one and at most three.

In the example above, the reports can be found safe or unsafe by checking those rules:

  • 7 6 4 2 1: Safe because the levels are all decreasing by 1 or 2.
  • 1 2 7 8 9: Unsafe because 2 7 is an increase of 5.
  • 9 7 6 2 1: Unsafe because 6 2 is a decrease of 4.
  • 1 3 2 4 5: Unsafe because 1 3 is increasing but 3 2 is decreasing.
  • 8 6 4 4 1: Unsafe because 4 4 is neither an increase or a decrease.
  • 1 3 6 7 9: Safe because the levels are all increasing by 1, 2, or 3.

So, in this example, =2= reports are safe.

Analyze the unusual data from the engineers. How many reports are safe?

My input file: 2024-12-02-1-aoc.txt

First, let’s load the data and take a look:

data <- data.table::fread("2024-12-02-1-aoc.txt", header = FALSE, fill = TRUE)
head(data)
| V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 |
|-------|-------|-------|-------|-------|-------|-------|-------|
| <int> | <int> | <int> | <int> | <int> | <int> | <int> | <int> |
| 74 | 76 | 78 | 79 | 76 | NA | NA | NA |
| 38 | 40 | 43 | 44 | 44 | NA | NA | NA |
| 1 | 2 | 4 | 6 | 8 | 9 | 13 | NA |
| 65 | 68 | 70 | 72 | 75 | 76 | 81 | NA |
| 89 | 91 | 92 | 95 | 93 | 94 | NA | NA |
| 15 | 17 | 16 | 18 | 19 | 17 | NA | NA |

My core idea here was to compute the differences between adjacent elements in a row - luckily, R has a built-in diff() function that does exactly that. Then we check whether all differences are either 1, 2, 3 or -1, -2, -3. Note that each row has a different number of elements, so we have to drop NA values via na.omit before applying the check.

isalltrue <- function(rep) {
    return(all(diff(rep) %in% c(1, 2, 3)) |
           all(diff(rep) %in% c(-1, -2, -3)))
}

isreport <- function(row) {
    return(isalltrue(na.omit(row)))
}
valid.reports <- apply(data, 1, isreport)
table(valid.reports)
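For comparison, the same diff-based safety check can be sketched in Python on the example reports (the helper name is_safe is hypothetical, not part of the R code):

```python
# Diff-based safety check on the example reports from the puzzle text.
def is_safe(levels):
    diffs = [b - a for a, b in zip(levels, levels[1:])]
    return (all(1 <= d <= 3 for d in diffs)
            or all(-3 <= d <= -1 for d in diffs))

reports = [[7, 6, 4, 2, 1], [1, 2, 7, 8, 9], [9, 7, 6, 2, 1],
           [1, 3, 2, 4, 5], [8, 6, 4, 4, 1], [1, 3, 6, 7, 9]]
print(sum(map(is_safe, reports)))  # 2
```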

Part 2: Problem Dampener

The engineers are surprised by the low number of safe reports until they realize they forgot to tell you about the Problem Dampener.

The Problem Dampener is a reactor-mounted module that lets the reactor safety systems tolerate a single bad level in what would otherwise be a safe report. It’s like the bad level never happened!

Now, the same rules apply as before, except if removing a single level from an unsafe report would make it safe, the report instead counts as safe.

More of the above example’s reports are now safe:

  • =7 6 4 2 1=: Safe without removing any level.
  • =1 2 7 8 9=: Unsafe regardless of which level is removed.
  • =9 7 6 2 1=: Unsafe regardless of which level is removed.
  • =1 3 2 4 5=: Safe by removing the second level, 3.
  • =8 6 4 4 1=: Safe by removing the third level, 4.
  • =1 3 6 7 9=: Safe without removing any level.

Thanks to the Problem Dampener, =4= reports are actually safe!

Update your analysis by handling situations where the Problem Dampener can remove a single level from unsafe reports. How many reports are now safe?

So, this task is a good example of how easy it is to get lost in an overly complicated approach. My first idea was to somehow determine whether the general trend was ascending or descending and, based on that, remove differences pointing the opposite way. This got too complicated, and in the end I found four fewer safe reports than expected… (Also note the beautiful print debugging…)

My actual working approach was, in the end, brute force. I just threw out each element of a row in turn and checked whether the report became safe - much simpler and effective!

isreport2 <- function(row) {
    rep <- na.omit(row)
    out <- isalltrue(rep)
    if (out == FALSE) {
        for (i in seq_along(rep)) {
            # rep <- rep[-i] does not update the variable
            # with rep.i, a new variable is assigned (rep.1, rep.2, ...)
            rep.i <- rep[-i]
            out <- isalltrue(rep.i)
            if (out == TRUE) {
                break
            }
        }
    }
    return(out)
}
sum(apply(data, 1, isreport2))
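The same brute-force dampening idea, sketched in Python on the example reports (the helper names is_safe and is_safe_dampened are hypothetical, not from the R code):

```python
# Brute-force dampening: if a report is unsafe, try dropping each level once.
def is_safe(levels):
    diffs = [b - a for a, b in zip(levels, levels[1:])]
    return (all(1 <= d <= 3 for d in diffs)
            or all(-3 <= d <= -1 for d in diffs))

def is_safe_dampened(levels):
    if is_safe(levels):
        return True
    return any(is_safe(levels[:i] + levels[i + 1:])
               for i in range(len(levels)))

reports = [[7, 6, 4, 2, 1], [1, 2, 7, 8, 9], [9, 7, 6, 2, 1],
           [1, 3, 2, 4, 5], [8, 6, 4, 4, 1], [1, 3, 6, 7, 9]]
print(sum(map(is_safe_dampened, reports)))  # 4
```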

Advent of code 2024 - Day 2: Red-Nosed Reports

This year I am attempting the Advent of Code 2024 challenges in the statistics-focused programming language R. This is day 2 - see https://adventofcode.com/2024/day/2

For the solutions, see the English version of this blog post.

Advent of code 2024 - Day 3: Mull It Over

I try to solve this year’s Advent of Code 2024 riddles in R. This is Day 3 - see https://adventofcode.com/2024/day/3

Part 1: mul()

The computer appears to be trying to run a program, but its memory (your puzzle input) is corrupted. All of the instructions have been jumbled up!

It seems like the goal of the program is just to multiply some numbers. It does that with instructions like mul(X,Y), where X and Y are each 1-3 digit numbers. For instance, mul(44,46) multiplies 44 by 46 to get a result of 2024. Similarly, mul(123,4) would multiply 123 by 4.

However, because the program’s memory has been corrupted, there are also many invalid characters that should be ignored, even if they look like part of a mul instruction. Sequences like mul(4*, mul(6,9!, ?(12,34), or mul ( 2 , 4 ) do nothing.

For example, consider the following section of corrupted memory:

xmul(2,4)%&mul[3,7]!@^do_not_mul(5,5)+mul(32,64]then(mul(11,8)mul(8,5))

Only the four highlighted sections are real mul instructions. Adding up the result of each instruction produces 161 (2*4 + 5*5 + 11*8 + 8*5).

Scan the corrupted memory for uncorrupted mul instructions. What do you get if you add up all of the results of the multiplications?

My input file: 2024-12-03-1-aoc.txt

First, let’s load the data and take a look:

library(tidyverse)

connection <- file("2024-12-03-1-aoc.txt", open = "r")
lines <- readLines(connection)
print(lines[1])
mullist <- stringr::str_extract_all(lines, "mul\\(\\d*,\\d*\\)")
mullist <- unlist(mullist, recursive = FALSE)
head(mullist)
df <- mullist |>
  data.frame() |>
  magrittr::set_colnames(c("muls")) |>
  tidyr::separate_wider_regex(
           cols = "muls",
           patterns = c(
             "mul\\(",
             mul1 = "\\d*",
             ",",
             mul2 = "\\d*"
           ),
           too_few = "align_start"
         ) |>
  mutate(
    mul1 = as.integer(mul1),
    mul2 = as.integer(mul2),
    result = mul1 * mul2
  )
head(df)
| mul1 | mul2 | result |
|-------|-------|--------|
| <int> | <int> | <int> |
| 948 | 148 | 140304 |
| 590 | 32 | 18880 |
| 611 | 372 | 227292 |
| 835 | 665 | 555275 |
| 724 | 851 | 616124 |
| 188 | 482 | 90616 |
sum(df$result)
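As a sanity check, the same regex extraction can be written compactly in Python against the example memory string from the puzzle (a sketch, not the R pipeline above):

```python
import re

# Example memory string from the puzzle text.
mem = "xmul(2,4)%&mul[3,7]!@^do_not_mul(5,5)+mul(32,64]then(mul(11,8)mul(8,5))"
# Only 1-3 digit operands inside a literal mul(...) count as valid.
total = sum(int(a) * int(b)
            for a, b in re.findall(r"mul\((\d{1,3}),(\d{1,3})\)", mem))
print(total)  # 161
```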

Part 2: do() and don’t()

There are two new instructions you’ll need to handle:

  • The do() instruction enables future mul instructions.
  • The don't() instruction disables future mul instructions.

Only the most recent do() or don't() instruction applies. At the beginning of the program, mul instructions are enabled.

For example:

xmul(2,4)&mul[3,7]!^don't()_mul(5,5)+mul(32,64](mul(11,8)undo()?mul(8,5))

This corrupted memory is similar to the example from before, but this time the mul(5,5) and mul(11,8) instructions are disabled because there is a don't() instruction before them. The other mul instructions function normally, including the one at the end that gets re-enabled by a do() instruction.

This time, the sum of the results is =48= (2*4 + 8*5).

Handle the new instructions; what do you get if you add up all of the results of just the enabled multiplications?

A helpful reference for matching text before/after a keyword with regular expressions: https://stackoverflow.com/questions/36056219/regex-match-first-occurrence-before-keyword

lines[[4]]
# Exploratory segment extraction. Three kinds of enabled segments per line:
# the prefix up to the first don't(), every do()...don't() block in between,
# and a trailing do() block with no don't() after it.
dolist7 <- stringr::str_extract_all(lines, "^(?:.*?)don't\\(\\)")
dolist8 <- stringr::str_extract_all(lines,
                                    "(?<!(don't\\(\\)))(do\\(\\)(?:.*?)don't\\(\\))")
dolist9 <- stringr::str_extract_all(lines,
                                    "do\\(\\)(?:.(?!(don't\\(\\))))+$")
dolist7 <- unlist(dolist7, recursive = FALSE)
dolist8
# Note: only the prefix segments (dolist7) feed the extraction below.
mullist <- stringr::str_extract_all(dolist7, "mul\\(\\d*,\\d*\\)")
mullist <- unlist(mullist, recursive = FALSE)
length(mullist)
df <- mullist |>
  data.frame() |>
  magrittr::set_colnames(c("muls")) |>
  tidyr::separate_wider_regex(
           cols = "muls",
           patterns = c(
             "mul\\(",
             mul1 = "\\d*",
             ",",
             mul2 = "\\d*"
           ),
           too_few = "align_start"
         ) |>
  mutate(
    mul1 = as.integer(mul1),
    mul2 = as.integer(mul2),
    result = mul1 * mul2
  )
head(df)
| mul1 | mul2 | result |
|-------|-------|--------|
| <int> | <int> | <int> |
| 948 | 148 | 140304 |
| 590 | 32 | 18880 |
| 611 | 372 | 227292 |
| 835 | 665 | 555275 |
| 416 | 366 | 152256 |
| 459 | 47 | 21573 |
sum(df$result)
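A cleaner alternative to splitting the memory into segments - which I did not use above - is a single left-to-right scan that toggles an enabled flag, sketched here in Python on the example string:

```python
import re

# Example string from the puzzle text.
mem = "xmul(2,4)&mul[3,7]!^don't()_mul(5,5)+mul(32,64](mul(11,8)undo()?mul(8,5))"
enabled, total = True, 0
# Scan left to right; do()/don't() toggle the flag, enabled muls are summed.
for m in re.finditer(r"mul\((\d{1,3}),(\d{1,3})\)|do\(\)|don't\(\)", mem):
    if m.group(0) == "do()":
        enabled = True
    elif m.group(0) == "don't()":
        enabled = False
    elif enabled:
        total += int(m.group(1)) * int(m.group(2))
print(total)  # 48
```

Because a single alternation matches all three instruction kinds, the scan processes them strictly in the order they appear in memory.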

Advent of code 2024 - Day 3: …

This year I am attempting the Advent of Code 2024 challenges in the statistics-focused programming language R. This is day 3 - see https://adventofcode.com/2024/day/3

For the solutions, see the English version of this blog post.

Yellow Wallpaper and a king in all things - Reads June 2024

  • Charlotte Perkins Gilman: The Yellow Wallpaper
    • A classic of feminist literature with a soft horror touch - a quick read (only 24 pages) and definitely worth it.
  • Herta Müller: Der König verneigt sich und tötet (essay collection)
    • Jede Sprache hat andere Augen
    • Der König verneigt sich und tötet

The Yellow Wallpaper and a king in all things - Reading list June 2024

Connection on Mount Misery vs Corona

A core thought in Samuel Shem’s Mount Misery: It is not self that heals, but connection.

Corona: connection (social closeness) kills.

Connection in Mount Misery vs Corona

See the English version.

Thoughts on Thea Dorn about consolation in pandemic times

See German version.

Thoughts on Thea Dorn on consolation and religiosity in times of the pandemic

In a recent episode of the ZEIT podcast “Alles gesagt”, Thea Dorn, Jochen Wegner, and Christoph Amend discuss the book Trost. Briefe an Max. They approach a core problem of the modern human: we aim either to minimize risks before an unwanted event, or to demand compensation and accountability afterwards. We have unlearned how to deal with the imponderables of life. Religiosity, with its consoling effect, is said to be the decisive way out.

Points I want to address:

  • Thea Dorn flatly quotes Nietzsche as “God is dead, man is God, etc.” → go into Nietzsche’s “God is dead” as a call to humanity in the “enlightened” world that basic things like human rights, or consolation, need a new justification. Faith as a mental trick. “God is dead” as a call, charged with hope, to find a more sustainable, deeper basis.
  • She goes back to religiosity. Socialized in East Germany, I cannot go back. I was not introduced to faith as a child; it remains an inaccessible topic for me, one I cannot accept as honest and true consolation. I know immovably that life is a short, random moment, that there is nothing after death, that in the end life has no higher meaning. This is my starting point. Is this the reason why I feel no hopelessness in these reflections? I do not have to linger there; I can take a step further: in the end, what counts is what I do in my present life.
  • That life exists is chemical chance; that human life exists is evolutionary chance; that I am born in a rich country is historical chance; that my parents can provide for me and raise me well is familial chance. Do we need an ethics of chance? If so many parts of our existence are chance, is that enough to provide consolation?
  • When calamity strikes, it is a stroke of fate - but is it not also a statistical quantity? Medical school shows me: people fall ill all the time, while I often get off lightly. I was not involved in a fatal traffic accident at 18. I did not die of acute leukemia at 23. If I saw these statistical threats as real threats all the time, I would certainly go mad, despair, become hopeless.
  • Yet I am not. Why not? Certainly because of a certain stoic attitude. But that is not enough; it also takes an active part - risk minimization ultimately offers this reassurance: “I have done everything I could”. Christian Drosten had a similar motto: we must be concerned and take measures - but not panic.

A further topic: Thea Dorn says natural scientists weigh things against each other: we calculate what causes how much suffering, then we choose what causes less. That, she says, is utilitarianism. She, however, says: we must weigh saving lives against “freedom”.

  • with Friederike Schmitz: weighing in the legal sense IS utilitarianism!

Reading “The Shadow of the Sun” by Ryszard Kapuściński


General impression: a thoughtful book; mostly he tries to avoid single stories and gives an impression of different scenes in or between various African countries from the end of the 1950s to the 1990s. From time to time he tries to explain certain views or mentalities in African countries, which might come off as condescending from today’s view and be of more interest to non-African readers than to African readers.

In this blog post I will share the more impressionistic quotes I liked, while some historical takeaways will be collected in Notes

p 29: The course and temperature of the first greeting define the fate of the relationship

  • from very first second: exhibit enormous primal joy and geniality
  • extend hand in large, vigorous gesture
  • loud laughter, many questions

p 36ff.: colonial heritage

  • a state wherein the civil servant received remuneration beyond all measure and reason (a white low-level bureaucrat suddenly gets a villa, servants, …)
    • after independence this system quickly gives rise to new elites
    • French: la politique du ventre (… of the stomach)
  • 10,000 kingdoms, federations and stateless but independent tribal associations crammed into ~40 colonies! (without asking)
  • ports - only leeches on the body of Africa, points of export for slaves, gold and ivory
  • p52: Why Indians built the railway
    • White worker from Europe → was the master, could not do physical labour
    • African worker → “did not exist”: the concept of wages was missing, and the British later had a system of forced labour
  • p82: Islands around Africa
    • Were bases for sailors, merchants, and robbers (especially Europeans)
    • Hard to reach for unstable African boats → a spot for concentration camps for the slave trade
  • p83: The philosophy that inspired the construction of Kolyma and Auschwitz, one of obsessive contempt and hatred, vileness and brutality, was formulated and set down centuries earlier by the captains of the Martha and the Progress, the Marie Ann and the Rainbow

p60: Secrets of Serengeti

  • lions attacking humans → are old outcasts
  • where are the elephant cemeteries → at the bottom of lakes

Europeans, Africans and Racism

  • p70: Poland vs Tanzania
    • children asking Ryszard in Poland “And did you see many cannibals?”
    • Mothers in Tanganyika to their children: “You had better be good, or else the mzungu (Swahili: the white man, the European) will eat you!”
  • p110: So often I had felt irritated with people who arrived here, lived in “little Europe” or “little America” (e.g. in luxury hotels), and departed, bragging later that they had been to Africa, a place that in reality they had never seen.

wonders

  • Lalibela: 11 great churches carved in stone and misery