I am a researcher and physician, currently working with Marie von Lilienfeld-Toal at the Institute for Diversity Medicine at Ruhr University Bochum, formerly at the Biophysical Imaging Lab of Christian Eggeling in Jena.
You can get in contact with me by sending an email to alva.seltmann (at) rub (dot) de. You can also find me on Github.
I am a scientist and physician, currently working with Marie von Lilienfeld-Toal at the Institute for Diversity Medicine at Ruhr University Bochum, and before that in the Biophysical Imaging group of Christian Eggeling in Jena. I can be reached by email at alva.seltmann (at) rub (dot) de. You can also find me on Github.
I am partial to the translation by Derek Lin.
Examples:
| Human Era year | Common Era year | Event |
|---|---|---|
| 1001 HE | 9000 BCE | Jericho |
| 7301 HE | 2700 BCE | First pyramid |
| 11460 HE | 1460 CE | Machu Picchu built |
| 11945 HE | 1945 CE | United Nations founded |
Wonderful illustration of the case for the Human Era (courtesy of [kurzgesagt.org](https://kurzgesagt.org)):
Structured arguments and thoughts, often presented in lists.
Bold, italic, verbatim, strikethrough
This is a test. And Hugo Book Shortcodes in an org source file:
Here are the details.
This should also be possible in plain org-mode: π(x) vs
This works, but requires MathJax:
\begin{equation} \label{eq:1} C = W \log_2 (1+\mathrm{SNR}) \end{equation}
x = \begin{cases}
a &\text{if } b \\
c &\text{if } d
\end{cases}
Test section text.
Here is other critical stuff.
- promise and hope, with freedom comes prosperity
- movement of French colonies → want to become French citizens
- movement of British colonies → Pan-Africanism
- people: activist Alexander Crummell, writer W. E. B. Du Bois (core idea: Black* people should remain in the countries where they now live), journalist Marcus Garvey (core idea: Black* people should return to Africa)
- core ideas
- all Blacks* in the world constitute a single race and a single culture, and should be proud of their skin colour
- all of Africa should be independent and united → “Africa for Africans”
- BUT: corrupt elites - why?
- no well-developed private sector; plantations belonged to foreigners, banks belonged to foreign capital → a political career was the only path to wealth
- civil wars, revolts, massacres, hunger
- opponents used all means (tribal and ethnic conflicts, military might, corruption, murder)
- cold war era → problems + interests of weaker, dependent countries were ignored, subordinate to superpower interests
- coup d’état in Nigeria - military seizes power after only 5 years of independence
- and people celebrate! Since their own elite is seen as too corrupt (“black imperialists”, “political wolves, who plunder the country”)
Another continent to describe.
I personally like the translation by Günther Debon.
Test post text.
Test post text.
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 1 - see https://adventofcode.com/2023/day/1
The newly-improved calibration document consists of lines of text; each line originally contained a specific calibration value that the Elves now need to recover. On each line, the calibration value can be found by combining the first digit and the last digit (in that order) to form a single two-digit number.
For example:
1abc2
pqr3stu8vwx
a1b2c3d4e5f
treb7uchet
In this example, the calibration values of these four lines are 12, 38, 15, and 77. Adding these together produces 142.
Consider your entire calibration document. What is the sum of all of the calibration values?
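Before loading the real input, the rule can be checked on the four example lines with a minimal self-contained sketch (plain Python, no pandas yet):

```python
# Sketch of the calibration rule on the example lines from the puzzle
lines = ["1abc2", "pqr3stu8vwx", "a1b2c3d4e5f", "treb7uchet"]

def calibration(line):
    digits = [c for c in line if c.isdigit()]
    # first and last digit form a single two-digit number
    return int(digits[0] + digits[-1])

values = [calibration(l) for l in lines]
print(values, sum(values))  # [12, 38, 15, 77] 142
```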
My input file: 2023-12-01-1-aoc.txt
Let's start Jupyter in our shell and get coding!
conda activate tf
jupyter lab --no-browser --port=8888
First, load the test document:
import pandas as pd
import re
txt = pd.read_table('data/2023-12-01-1-aoc.txt', names=['code'])
txt
| code | |
|---|---|
| 0 | jjfvnnlfivejj1 |
| 1 | 6fourfour |
| 2 | ninevbmltwo69 |
| 3 | pcg91vqrfpxxzzzoneightzt |
| 4 | jpprthxgjfive3one1qckhrptpqdc |
| … | … |
| 995 | 583sevenhjxlqzjgbzxhkcl5 |
| 996 | 81s |
| 997 | 2four3threesxxvlfqfive4 |
| 998 | nine6eightsevenzx9twoxc |
| 999 | hmbfjdfnp989mfivefiverpzrjs |
1000 rows × 1 columns
Second, extract the digits. I had to wrap my head around regex matching in
Python first, because I initially tried pandas Series.str.extract (which only
extracts the first match), then Series.str.extractall (which extracts all
matches but puts them into a MultiIndex, making things more difficult in this
case). So I settled on re.findall, which returns a list. To concatenate the
elements of the list, we use the join function.
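The difference between the three approaches can be seen on a toy Series (illustrative data, not the puzzle input):

```python
import re

import pandas as pd

# toy data to compare the three extraction approaches
s = pd.Series(['a12b34', 'x5y'])

# str.extract returns only the first match per row
first = s.str.extract(r'(\d+)')
# str.extractall stacks all matches into a (row, match) MultiIndex
all_matches = s.str.extractall(r'(\d+)')
# re.findall returns a plain list per row, which is easy to join
joined = s.apply(lambda x: ''.join(re.findall(r'\d+', x)))
print(joined.tolist())  # ['1234', '5']
```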
txt['digits'] = txt.loc[:, 'code'].apply(
lambda x: ''.join(re.findall(r'(\d+)', x)))
txt
| code | digits | |
|---|---|---|
| 0 | jjfvnnlfivejj1 | 1 |
| 1 | 6fourfour | 6 |
| 2 | ninevbmltwo69 | 69 |
| 3 | pcg91vqrfpxxzzzoneightzt | 91 |
| 4 | jpprthxgjfive3one1qckhrptpqdc | 31 |
| … | … | … |
| 995 | 583sevenhjxlqzjgbzxhkcl5 | 5835 |
| 996 | 81s | 81 |
| 997 | 2four3threesxxvlfqfive4 | 234 |
| 998 | nine6eightsevenzx9twoxc | 69 |
| 999 | hmbfjdfnp989mfivefiverpzrjs | 989 |
1000 rows × 2 columns
Next, combine the first and the last digit and convert the result from string to integer
txt['calibration'] = txt.loc[:, 'digits'].apply(
lambda x: int(x[0] + x[-1]))
txt
| code | digits | calibration | |
|---|---|---|---|
| 0 | jjfvnnlfivejj1 | 1 | 11 |
| 1 | 6fourfour | 6 | 66 |
| 2 | ninevbmltwo69 | 69 | 69 |
| 3 | pcg91vqrfpxxzzzoneightzt | 91 | 91 |
| 4 | jpprthxgjfive3one1qckhrptpqdc | 31 | 31 |
| … | … | … | … |
| 995 | 583sevenhjxlqzjgbzxhkcl5 | 5835 | 55 |
| 996 | 81s | 81 | 81 |
| 997 | 2four3threesxxvlfqfive4 | 234 | 24 |
| 998 | nine6eightsevenzx9twoxc | 69 | 69 |
| 999 | hmbfjdfnp989mfivefiverpzrjs | 989 | 99 |
1000 rows × 3 columns
Lastly, get the sum of our calibration numbers
txt.loc[:, 'calibration'].sum()
Your calculation isn’t quite right. It looks like some of the digits are actually spelled out with letters: one, two, three, four, five, six, seven, eight, and nine also count as valid “digits”.
Equipped with this new information, you now need to find the real first and last digit on each line. For example:
two1nine
eightwothree
abcone2threexyz
xtwone3four
4nineeightseven2
zoneight234
7pqrstsixteen
In this example, the calibration values are 29, 83, 13, 24, 42, 14, and 76. Adding these together produces 281.
What is the sum of all of the calibration values?
Okay, let’s see if we can update the pattern matching. To deal with potentially
overlapping values like oneight, which contains one as well as eight, I
used the regex positive lookahead ?= as described here. Because this enables
capturing overlapping values, I used \d (one digit) instead of \d+ (one or
more digits); otherwise digits might be counted twice. Afterwards, just replace
the spelled-out digits with their numerical value.
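A quick check of the lookahead trick on two of the example lines shows why it is needed: a plain alternation would consume the overlap.

```python
import re

pattern = r'(?=(\d|one|two|three|four|five|six|seven|eight|nine))'
# without the lookahead, 'eightwothree' would only yield 'eight' and 'three',
# because matching 'eight' consumes the 't' that 'two' needs
print(re.findall(pattern, 'eightwothree'))  # ['eight', 'two', 'three']
print(re.findall(pattern, 'oneight'))       # ['one', 'eight']
```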
# for i, r in enumerate(txt.loc[:, 'code']):
# matches = re.findall(
# r'(?=(\d|one|two|three|four|five|six|seven|eight|nine))', r)
# result = ''.join([match for match in matches])
# result = result.replace('one', '1').replace('two', '2').replace(
# 'three', '3').replace('four', '4').replace('five', '5').replace(
# 'six', '6').replace('seven', '7').replace('eight', '8').replace(
# 'nine', '9')
# txt.loc[i, 'digits2'] = result
# txt
# a very nice alternative suggested by Tomalak:
digits = r'\d one two three four five six seven eight nine'.split()
txt['digits2'] = txt.loc[:, 'code'].apply(lambda v: ''.join(
str(digits.index(m)) if m in digits else m
for m in re.findall(rf'(?=({"|".join(digits)}))', v)
))
txt
| code | digits | calibration | digits2 | |
|---|---|---|---|---|
| 0 | jjfvnnlfivejj1 | 1 | 11 | 51 |
| 1 | 6fourfour | 6 | 66 | 644 |
| 2 | ninevbmltwo69 | 69 | 69 | 9269 |
| 3 | pcg91vqrfpxxzzzoneightzt | 91 | 91 | 9118 |
| 4 | jpprthxgjfive3one1qckhrptpqdc | 31 | 31 | 5311 |
| … | … | … | … | … |
| 995 | 583sevenhjxlqzjgbzxhkcl5 | 5835 | 55 | 58375 |
| 996 | 81s | 81 | 81 | 81 |
| 997 | 2four3threesxxvlfqfive4 | 234 | 24 | 243354 |
| 998 | nine6eightsevenzx9twoxc | 69 | 69 | 968792 |
| 999 | hmbfjdfnp989mfivefiverpzrjs | 989 | 99 | 98955 |
1000 rows × 4 columns
Now, construct the calibration value as before…
txt['calibration2'] = txt.loc[:, 'digits2'].apply(lambda x: int(x[0] + x[-1]))
txt
| code | digits | calibration | digits2 | calibration2 | |
|---|---|---|---|---|---|
| 0 | jjfvnnlfivejj1 | 1 | 11 | 51 | 51 |
| 1 | 6fourfour | 6 | 66 | 644 | 64 |
| 2 | ninevbmltwo69 | 69 | 69 | 9269 | 99 |
| 3 | pcg91vqrfpxxzzzoneightzt | 91 | 91 | 9118 | 98 |
| 4 | jpprthxgjfive3one1qckhrptpqdc | 31 | 31 | 5311 | 51 |
| … | … | … | … | … | … |
| 995 | 583sevenhjxlqzjgbzxhkcl5 | 5835 | 55 | 58375 | 55 |
| 996 | 81s | 81 | 81 | 81 | 81 |
| 997 | 2four3threesxxvlfqfive4 | 234 | 24 | 243354 | 24 |
| 998 | nine6eightsevenzx9twoxc | 69 | 69 | 968792 | 92 |
| 999 | hmbfjdfnp989mfivefiverpzrjs | 989 | 99 | 98955 | 95 |
1000 rows × 5 columns
… and get the correct sum!
txt.loc[:, 'calibration2'].sum()
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 1 - see https://adventofcode.com/2023/day/1
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 2 - see https://adventofcode.com/2023/day/2
As you walk, the Elf shows you a small bag and some cubes which are either red, green, or blue. Each time you play this game, he will hide a secret number of cubes of each color in the bag, and your goal is to figure out information about the number of cubes.
To get information, once a bag has been loaded with cubes, the Elf will reach into the bag, grab a handful of random cubes, show them to you, and then put them back in the bag. He’ll do this a few times per game.
You play several games and record the information from each game (your puzzle input). Each game is listed with its ID number (like the 11 in Game 11: ...) followed by a semicolon-separated list of subsets of cubes that were revealed from the bag (like 3 red, 5 green, 4 blue).
For example, the record of a few games might look like this:
Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green
Game 2: 1 blue, 2 green; 3 green, 4 blue, 1 red; 1 green, 1 blue
Game 3: 8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red
Game 4: 1 green, 3 red, 6 blue; 3 green, 6 red; 3 green, 15 blue, 14 red
Game 5: 6 red, 1 blue, 3 green; 2 blue, 1 red, 2 green
In game 1, three sets of cubes are revealed from the bag (and then put back again). The first set is 3 blue cubes and 4 red cubes; the second set is 1 red cube, 2 green cubes, and 6 blue cubes; the third set is only 2 green cubes.
The Elf would first like to know which games would have been possible if the bag contained only 12 red cubes, 13 green cubes, and 14 blue cubes?
In the example above, games 1, 2, and 5 would have been possible if the bag had been loaded with that configuration. However, game 3 would have been impossible because at one point the Elf showed you 20 red cubes at once; similarly, game 4 would also have been impossible because the Elf showed you 15 blue cubes at once. If you add up the IDs of the games that would have been possible, you get 8.
Determine which games would have been possible if the bag had been loaded with only 12 red cubes, 13 green cubes, and 14 blue cubes. What is the sum of the IDs of those games?
My input file: 2023-12-02-1-aoc.txt
Okay, let’s load our python kernel in emacs-jupyter and get coding! First of
all, let’s load the input and split the riddle code by colon : to extract the
game id and the rest of the code by semicolon ; to get the number of sets
played in each game.
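The splitting logic can be tried on a single example line first (plain Python; str.removeprefix requires Python 3.9+):

```python
# Parse one record line with plain string splits (example line from the puzzle)
line = 'Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green'
head, body = line.split(':')
game_id = int(head.removeprefix('Game '))
sets = [s.strip() for s in body.split(';')]
print(game_id, sets)  # 1 ['3 blue, 4 red', '1 red, 2 green, 6 blue', '2 green']
```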
import pandas as pd
import re
txt = pd.read_table('data/2023-12-02-1-aoc.txt', names=['code'])
txt['id'] = txt.loc[:, 'code'].str.split(':').apply(
lambda x: int(x[0].strip('Game ')))
txt['code'] = txt.loc[:, 'code'].str.split(':').apply(lambda x: x[1])
# txt['code'] = txt.loc[:, 'code'].str.split(';')
# txt['nsets'] = txt.loc[:, 'code'].apply(lambda x: len(x))
txt
| code | id | |
|---|---|---|
| 0 | 1 green, 1 blue, 1 red; 1 green, 8 red, 7 blu… | 1 |
| 1 | 9 red, 7 green, 3 blue; 15 green, 2 blue, 5 r… | 2 |
| 2 | 3 red, 1 blue, 4 green; 6 red, 3 green, 2 blu… | 3 |
| 3 | 2 blue, 2 green, 19 red; 3 blue, 11 red, 16 g… | 4 |
| 4 | 8 green, 1 red, 12 blue; 10 green, 6 red, 13 … | 5 |
| … | … | … |
| 95 | 2 red, 2 green, 1 blue; 1 red, 4 green; 1 green | 96 |
| 96 | 4 red, 5 green; 5 blue, 3 red; 8 blue, 2 gree… | 97 |
| 97 | 1 blue; 2 green, 1 red; 5 red, 2 green; 4 red… | 98 |
| 98 | 6 blue, 5 red, 2 green; 9 red, 1 blue; 2 gree… | 99 |
| 99 | 1 blue, 13 green, 14 red; 11 green, 11 blue, … | 100 |
100 rows × 2 columns
Now, let’s extract the three colours into different columns with regex. We use
the lookahead assertion ?= to find the respective colours and only extract the
digits \d+ coming before them. Then we just keep the maximum drawn number of
cubes per colour, since this is the only information that matters at the moment.
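The lookahead can be checked on one record from the example (here matching ' green' literally; the version below uses . for the separator, which behaves the same on this input):

```python
import re

line = '3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green'
# \d+ matches the count, (?= green) asserts the colour follows without consuming it
greens = [int(n) for n in re.findall(r'\d+(?= green)', line)]
print(greens, max(greens))  # [2, 2] 2
```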
txt['green'] = txt.loc[:, 'code'].apply(
lambda code: re.findall(r'\d+(?=.green)', code)).apply(
lambda list: max([int(i) for i in list]))
txt['red'] = txt.loc[:, 'code'].apply(
lambda code: re.findall(r'\d+(?=.red)', code)).apply(
lambda list: max([int(i) for i in list]))
txt['blue'] = txt.loc[:, 'code'].apply(
lambda code: re.findall(r'\d+(?=.blue)', code)).apply(
lambda list: max([int(i) for i in list]))
txt
| code | id | green | red | blue | |
|---|---|---|---|---|---|
| 0 | 1 green, 1 blue, 1 red; 1 green, 8 red, 7 blu… | 1 | 2 | 10 | 10 |
| 1 | 9 red, 7 green, 3 blue; 15 green, 2 blue, 5 r… | 2 | 15 | 10 | 3 |
| 2 | 3 red, 1 blue, 4 green; 6 red, 3 green, 2 blu… | 3 | 4 | 6 | 16 |
| 3 | 2 blue, 2 green, 19 red; 3 blue, 11 red, 16 g… | 4 | 16 | 20 | 18 |
| 4 | 8 green, 1 red, 12 blue; 10 green, 6 red, 13 … | 5 | 10 | 6 | 14 |
| … | … | … | … | … | … |
| 95 | 2 red, 2 green, 1 blue; 1 red, 4 green; 1 green | 96 | 4 | 2 | 1 |
| 96 | 4 red, 5 green; 5 blue, 3 red; 8 blue, 2 gree… | 97 | 5 | 4 | 8 |
| 97 | 1 blue; 2 green, 1 red; 5 red, 2 green; 4 red… | 98 | 2 | 5 | 2 |
| 98 | 6 blue, 5 red, 2 green; 9 red, 1 blue; 2 gree… | 99 | 2 | 9 | 11 |
| 99 | 1 blue, 13 green, 14 red; 11 green, 11 blue, … | 100 | 13 | 15 | 11 |
100 rows × 5 columns
Lastly, we just filter the DataFrame to keep only the games where every drawn count was less than or equal to the number of cubes in the bag (12 red, 13 green, 14 blue), and sum the result!
txt['id'][(txt['green'] <= 13) & (txt['red'] <= 12) & (txt['blue'] <= 14)].sum()
As you continue your walk, the Elf poses a second question: in each game you played, what is the fewest number of cubes of each color that could have been in the bag to make the game possible?
Again consider the example games from earlier:
Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green
Game 2: 1 blue, 2 green; 3 green, 4 blue, 1 red; 1 green, 1 blue
Game 3: 8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red
Game 4: 1 green, 3 red, 6 blue; 3 green, 6 red; 3 green, 15 blue, 14 red
Game 5: 6 red, 1 blue, 3 green; 2 blue, 1 red, 2 green
- In game 1, the game could have been played with as few as 4 red, 2 green, and 6 blue cubes. If any color had even one fewer cube, the game would have been impossible.
- Game 2 could have been played with a minimum of 1 red, 3 green, and 4 blue cubes.
- Game 3 must have been played with at least 20 red, 13 green, and 6 blue cubes.
- Game 4 required at least 14 red, 3 green, and 15 blue cubes.
- Game 5 needed no fewer than 6 red, 3 green, and 2 blue cubes in the bag.
The power of a set of cubes is equal to the numbers of red, green, and blue cubes multiplied together. The power of the minimum set of cubes in game 1 is 48. In games 2-5 it was 12, 1560, 630, and 36, respectively. Adding up these five powers produces the sum 2286.
For each game, find the minimum set of cubes that must have been present. What is the sum of the power of these sets?
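The power arithmetic from the example can be checked directly (minimum sets taken from the bullet list above, as red, green, blue per game):

```python
# Minimum sets from the worked example (red, green, blue per game)
minima = [(4, 2, 6), (1, 3, 4), (20, 13, 6), (14, 3, 15), (6, 3, 2)]
powers = [r * g * b for r, g, b in minima]
print(powers, sum(powers))  # [48, 12, 1560, 630, 36] 2286
```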
Luckily, this task is made trivial by the approach we have taken before. We just
have to multiply the green, red and blue columns:
txt['power'] = txt.loc[:, 'green'] * txt.loc[:, 'blue'] * txt.loc[:, 'red']
txt
| code | id | green | red | blue | power | |
|---|---|---|---|---|---|---|
| 0 | 1 green, 1 blue, 1 red; 1 green, 8 red, 7 blu… | 1 | 2 | 10 | 10 | 200 |
| 1 | 9 red, 7 green, 3 blue; 15 green, 2 blue, 5 r… | 2 | 15 | 10 | 3 | 450 |
| 2 | 3 red, 1 blue, 4 green; 6 red, 3 green, 2 blu… | 3 | 4 | 6 | 16 | 384 |
| 3 | 2 blue, 2 green, 19 red; 3 blue, 11 red, 16 g… | 4 | 16 | 20 | 18 | 5760 |
| 4 | 8 green, 1 red, 12 blue; 10 green, 6 red, 13 … | 5 | 10 | 6 | 14 | 840 |
| … | … | … | … | … | … | … |
| 95 | 2 red, 2 green, 1 blue; 1 red, 4 green; 1 green | 96 | 4 | 2 | 1 | 8 |
| 96 | 4 red, 5 green; 5 blue, 3 red; 8 blue, 2 gree… | 97 | 5 | 4 | 8 | 160 |
| 97 | 1 blue; 2 green, 1 red; 5 red, 2 green; 4 red… | 98 | 2 | 5 | 2 | 20 |
| 98 | 6 blue, 5 red, 2 green; 9 red, 1 blue; 2 gree… | 99 | 2 | 9 | 11 | 198 |
| 99 | 1 blue, 13 green, 14 red; 11 green, 11 blue, … | 100 | 13 | 15 | 11 | 2145 |
100 rows × 6 columns
And for this one, the sum is:
txt['power'].sum()
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 2 - see https://adventofcode.com/2023/day/2
For the solutions, see the English version of this blog post.
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 3 - see https://adventofcode.com/2023/day/3
The engine schematic (your puzzle input) consists of a visual representation of the engine. There are lots of numbers and symbols you don’t really understand, but apparently any number adjacent to a symbol, even diagonally, is a “part number” and should be included in your sum. (Periods (.) do not count as a symbol.)
Here is an example engine schematic:
467..114..
...*......
..35..633.
......#...
617*......
.....+.58.
..592.....
......755.
...$.*....
.664.598..
In this schematic, two numbers are not part numbers because they are not adjacent to a symbol: 114 (top right) and 58 (middle right). Every other number is adjacent to a symbol and so is a part number; their sum is 4361.
Of course, the actual engine schematic is much larger. What is the sum of all of the part numbers in the engine schematic?
My input file: 2023-12-03-1-aoc.txt
Okay, let’s first get the input as a numpy character array
import numpy as np
import pandas as pd
import sys
import matplotlib.pyplot as plt
from scipy import ndimage as ndi
np.set_printoptions(threshold=sys.maxsize)
txt = pd.read_table('data/2023-12-03-1-aoc.txt', names=['code'])
arr = np.chararray((txt.size, txt.size), unicode=True)
txt['code'] = txt.loc[:, 'code'].apply(lambda x: [i for i in x])
for i, code in enumerate(txt['code']):
arr[i, :] = code
print((arr[:15, :15]))
Now extract symbols, digits and the empty space. We use the numpy character
methods for that. This way, we create a binary mask for all digits and a
binary mask for all empty space (.). The symbols are then every character
which is neither.
digits = np.char.isdigit(arr)
empty = np.char.endswith(arr, '.')
symbols = ~(digits | empty)
# just for visualization
plt.figure(figsize=(16, 5))
plt.subplot(131, title='symbols').matshow(symbols)
plt.subplot(132, title='digits').matshow(digits)
plt.subplot(133, title='empty').matshow(empty)
plt.show()
Now we use the image-processing technique of dilation on the symbols mask, so
that we get a new mask covering the surroundings of all symbols. We use this
afterwards to check whether the digits are near a symbol.
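On a toy mask, dilation with a 3×3 structuring element works like this (illustrative array, not the puzzle input):

```python
import numpy as np
from scipy import ndimage as ndi

mask = np.zeros((5, 5), dtype=bool)
mask[2, 2] = True  # a single "symbol"
struct = np.ones((3, 3), dtype=bool)
dilated = ndi.binary_dilation(mask, structure=struct)
# the dilated mask now covers the symbol and its 8 neighbours
print(dilated.sum())  # 9
```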
struct = ((1, 1, 1), (1, 1, 1), (1, 1, 1))
dilate = ndi.binary_dilation(symbols, structure=struct)
plt.figure(figsize=(10, 6))
plt.subplot(121, title='symbols').matshow(symbols[:15, :15])
plt.subplot(122, title='symbols dilated').matshow(dilate[:15, :15])
plt.show()
Creating these masks as before can be understood as a binary segmentation, as
each element in our mask is either True or False. To extract the individual
numbers, we’ll convert the binary digits segmentation into an instance
segmentation, where each connected segment has its own index.
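What ndi.label does can be seen on a tiny toy mask (illustrative array): each connected run of True values gets its own integer index.

```python
import numpy as np
from scipy import ndimage as ndi

digits = np.array([[1, 1, 0, 1],
                   [0, 0, 0, 1]], dtype=bool)
labels, n = ndi.label(digits)
print(n)  # 2 connected components
```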
markers, num_features = ndi.label(digits)
plt.figure(figsize=(10, 6))
plt.subplot(121, title='binary segmentation').matshow(
digits[:15, :15])
plt.subplot(122, title='instance segmentation').matshow(
markers[:15, :15], cmap='gnuplot')
plt.show()
Now, for each instance, we check if the dilated binary mask overlaps with the instance and if yes, we extract the number.
numbers = [int(''.join(arr[markers == i]))
for i in range(1, num_features+1)
if np.any((markers == i) & dilate)]
Then, we just sum up:
sum(numbers)
The missing part wasn’t the only issue - one of the gears in the engine is wrong. A gear is any * symbol that is adjacent to exactly two part numbers. Its gear ratio is the result of multiplying those two numbers together.
This time, you need to find the gear ratio of every gear and add them all up so that the engineer can figure out which gear needs to be replaced.
Consider the same engine schematic again:
467..114..
...*......
..35..633.
......#...
617*......
.....+.58.
..592.....
......755.
...$.*....
.664.598..
In this schematic, there are two gears. The first is in the top left; it has part numbers 467 and 35, so its gear ratio is 16345. The second gear is in the lower right; its gear ratio is 451490. (The * adjacent to 617 is not a gear because it is only adjacent to one part number.) Adding up all of the gear ratios produces 467835.
What is the sum of all of the gear ratios in your engine schematic?
We’ll use the same method as before, but this time only extract * and create
the instance segmentation before the dilation. Why? Because if we dilated
first, we could merge two independent gears into one instance.
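The order matters: on a toy row with two gears one cell apart (illustrative array), dilating first merges them into a single instance, while labelling first keeps them separate.

```python
import numpy as np
from scipy import ndimage as ndi

gear = np.array([[0, 1, 0, 1, 0]], dtype=bool)  # two gears one cell apart
struct = np.ones((1, 3), dtype=bool)
# label first: the two gears stay separate instances
_, n_label_first = ndi.label(gear)
# dilate first: their neighbourhoods touch and merge into one instance
_, n_dilate_first = ndi.label(ndi.binary_dilation(gear, structure=struct))
print(n_label_first, n_dilate_first)  # 2 1
```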
gear = np.char.endswith(arr, '*')
gear_markers, gear_num = ndi.label(gear)
plt.figure(figsize=(10, 6))
plt.subplot(131, title='all symbols').matshow(
    symbols[:15, :15])
plt.subplot(132, title='gears').matshow(
gear[:15, :15])
plt.subplot(133, title='gears instances').matshow(
gear_markers[:15, :15], cmap='gnuplot')
plt.show()
Now, we step through each gear instance, create a mask for that gear, dilate it, and then step through all digits and check if they are in the gear mask. If we get two digits in the end, we multiply them to get the gear ratio and save the ratios to a list.
gear_ratios = []
for i in range(1, gear_num+1):
gear_binary = gear_markers == i
gear_dil = ndi.binary_dilation(gear_binary, structure=struct)
part_numbers = [int(''.join(arr[markers == j]))
for j in range(1, num_features+1)
if np.any((markers == j) & gear_dil)]
if len(part_numbers) == 2:
gear_ratios.append(part_numbers[0] * part_numbers[1])
Now, we just sum the output again:
sum(gear_ratios)
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 3 - see https://adventofcode.com/2023/day/3
For the solutions, see the English version of this blog post.
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 4 - see https://adventofcode.com/2023/day/4
The Elf leads you over to the pile of colorful cards. There, you discover dozens of scratchcards, all with their opaque covering already scratched off. Picking one up, it looks like each card has two lists of numbers separated by a vertical bar (|): a list of winning numbers and then a list of numbers you have. You organize the information into a table (your puzzle input).
As far as the Elf has been able to figure out, you have to figure out which of the numbers you have appear in the list of winning numbers. The first match makes the card worth one point and each match after the first doubles the point value of that card.
For example:
Card 1: 41 48 83 86 17 | 83 86 6 31 17 9 48 53
Card 2: 13 32 20 16 61 | 61 30 68 82 17 32 24 19
Card 3: 1 21 53 59 44 | 69 82 63 72 16 21 14 1
Card 4: 41 92 73 84 69 | 59 84 76 51 58 5 54 83
Card 5: 87 83 26 28 32 | 88 30 70 12 93 22 82 36
Card 6: 31 18 13 56 72 | 74 77 10 23 35 67 36 11
In the above example, card 1 has five winning numbers (41, 48, 83, 86, and 17) and eight numbers you have (83, 86, 6, 31, 17, 9, 48, and 53). Of the numbers you have, four of them (48, 83, 17, and 86) are winning numbers! That means card 1 is worth 8 points (1 for the first match, then doubled three times for each of the three matches after the first).
- Card 2 has two winning numbers (32 and 61), so it is worth 2 points.
- Card 3 has two winning numbers (1 and 21), so it is worth 2 points.
- Card 4 has one winning number (84), so it is worth 1 point.
- Card 5 has no winning numbers, so it is worth no points.
- Card 6 has no winning numbers, so it is worth no points.
So, in this example, the Elf’s pile of scratchcards is worth 13 points.
Take a seat in the large pile of colorful cards. How many points are they worth in total?
My input file: 2023-12-04-1-aoc.txt
Loading this data is very similar to Day 2 - so let’s load the data as we did
there. Our goal is to get win and yours columns holding the respective
numbers we want to compare. We find the numbers with one or more digits
using the regex \d+. And we want them to be in sets (not lists), since sets
can be compared logically in Python.
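Set intersection does exactly the comparison we need; here on Card 1 from the example:

```python
# Card 1 from the worked example
win = {41, 48, 83, 86, 17}
yours = {83, 86, 6, 31, 17, 9, 48, 53}
matches = win & yours  # set intersection
print(sorted(matches), len(matches))  # [17, 48, 83, 86] 4
```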
import pandas as pd
import re
txt = pd.read_table('data/2023-12-04-1-aoc.txt', names=['win'])
txt['id'] = txt.loc[:, 'win'].str.split(':').apply(
lambda x: int(x[0].strip('Card ')))
txt['win'] = (txt.loc[:, 'win']
.str.split(':').apply(lambda x: x[1]))
txt['yours'] = (txt.loc[:, 'win']
.str.split('|')
.apply(lambda x: x[1])
# get a list of only the numbers / digits
.apply(lambda x: re.findall(r'\d+', x))
# convert the list of strings to a set of integers
.apply(lambda x: set([int(i) for i in x])))
txt['win'] = (txt.loc[:, 'win']
.str.split('|')
.apply(lambda x: x[0])
.apply(lambda x: re.findall(r'\d+', x))
.apply(lambda x: set([int(i) for i in x])))
txt
| win | id | yours | |
|---|---|---|---|
| 0 | {32, 36, 7, 9, 10, 12, 82, 85, 95, 31} | 1 | {2, 7, 9, 10, 12, 14, 21, 22, 23, 24, 31, 32, … |
| 1 | {35, 76, 16, 82, 19, 22, 88, 59, 60, 95} | 2 | {7, 8, 12, 16, 19, 22, 26, 28, 35, 38, 44, 51,… |
| 2 | {1, 70, 11, 78, 48, 19, 52, 88, 28, 94} | 3 | {3, 4, 8, 17, 18, 19, 24, 31, 34, 45, 52, 54, … |
| 3 | {65, 2, 72, 28, 14, 16, 55, 91, 92, 62} | 4 | {3, 4, 6, 7, 8, 9, 15, 30, 33, 35, 47, 49, 51,… |
| 4 | {38, 41, 75, 77, 50, 24, 94, 60, 61, 30} | 5 | {1, 2, 4, 5, 6, 7, 9, 10, 14, 17, 21, 29, 47, … |
| … | … | … | … |
| 213 | {97, 98, 39, 41, 43, 12, 13, 19, 93, 95} | 214 | {5, 10, 17, 20, 28, 29, 33, 34, 36, 50, 51, 52… |
| 214 | {97, 35, 69, 40, 74, 45, 20, 21, 62, 31} | 215 | {1, 8, 15, 17, 18, 25, 30, 33, 42, 44, 47, 52,… |
| 215 | {33, 70, 71, 12, 78, 17, 51, 86, 60, 94} | 216 | {7, 8, 9, 10, 22, 29, 37, 39, 41, 43, 46, 47, … |
| 216 | {98, 67, 68, 38, 70, 39, 72, 77, 45, 21} | 217 | {8, 21, 22, 25, 26, 31, 37, 41, 42, 48, 54, 57… |
| 217 | {34, 9, 44, 78, 79, 16, 17, 19, 55, 92} | 218 | {1, 4, 20, 21, 27, 38, 39, 40, 41, 45, 46, 52,… |
218 rows × 3 columns
Now, we take the intersection of win and yours; these are our matching
numbers. Then, the number of wins is converted to points - for any number of
wins greater than 1, the points are given by 2**(n_wins-1).
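The scoring rule can be sketched as a small helper, checked against the n_wins values from the worked example:

```python
def points(n_wins):
    # 1 point for the first match, doubled for every further match
    return 2 ** (n_wins - 1) if n_wins > 0 else 0

example = [4, 2, 2, 1, 0, 0]  # matches per card in the worked example
print([points(n) for n in example])  # [8, 2, 2, 1, 0, 0], summing to 13
```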
txt['n_wins'] = txt.apply(
lambda row: len(row.loc['win'] & row.loc['yours']), axis=1)
txt['points'] = txt.loc[:, 'n_wins'].apply(
lambda x: 2**(x-1) if x > 1 else x)
txt.loc[:, ['win', 'n_wins', 'points']]
| win | n_wins | points | |
|---|---|---|---|
| 0 | {32, 36, 7, 9, 10, 12, 82, 85, 95, 31} | 10 | 512 |
| 1 | {35, 76, 16, 82, 19, 22, 88, 59, 60, 95} | 10 | 512 |
| 2 | {1, 70, 11, 78, 48, 19, 52, 88, 28, 94} | 5 | 16 |
| 3 | {65, 2, 72, 28, 14, 16, 55, 91, 92, 62} | 0 | 0 |
| 4 | {38, 41, 75, 77, 50, 24, 94, 60, 61, 30} | 0 | 0 |
| … | … | … | … |
| 213 | {97, 98, 39, 41, 43, 12, 13, 19, 93, 95} | 0 | 0 |
| 214 | {97, 35, 69, 40, 74, 45, 20, 21, 62, 31} | 0 | 0 |
| 215 | {33, 70, 71, 12, 78, 17, 51, 86, 60, 94} | 2 | 2 |
| 216 | {98, 67, 68, 38, 70, 39, 72, 77, 45, 21} | 1 | 1 |
| 217 | {34, 9, 44, 78, 79, 16, 17, 19, 55, 92} | 0 | 0 |
218 rows × 3 columns
Then, we just sum up:
sum(txt.loc[:, 'points'])
There’s no such thing as “points”. Instead, scratchcards only cause you to win more scratchcards equal to the number of winning numbers you have.
Specifically, you win copies of the scratchcards below the winning card equal to the number of matches. So, if card 10 were to have 5 matching numbers, you would win one copy each of cards 11, 12, 13, 14, and 15.
Copies of scratchcards are scored like normal scratchcards and have the same card number as the card they copied. So, if you win a copy of card 10 and it has 5 matching numbers, it would then win a copy of the same cards that the original card 10 won: cards 11, 12, 13, 14, and 15. This process repeats until none of the copies cause you to win any more cards. (Cards will never make you copy a card past the end of the table.)
This time, the above example goes differently:
Card 1: 41 48 83 86 17 | 83 86 6 31 17 9 48 53
Card 2: 13 32 20 16 61 | 61 30 68 82 17 32 24 19
Card 3: 1 21 53 59 44 | 69 82 63 72 16 21 14 1
Card 4: 41 92 73 84 69 | 59 84 76 51 58 5 54 83
Card 5: 87 83 26 28 32 | 88 30 70 12 93 22 82 36
Card 6: 31 18 13 56 72 | 74 77 10 23 35 67 36 11
- Card 1 has four matching numbers, so you win one copy each of the next four cards: cards 2, 3, 4, and 5.
- Your original card 2 has two matching numbers, so you win one copy each of cards 3 and 4.
- Your copy of card 2 also wins one copy each of cards 3 and 4.
- Your four instances of card 3 (one original and three copies) have two matching numbers, so you win four copies each of cards 4 and 5.
- Your eight instances of card 4 (one original and seven copies) have one matching number, so you win eight copies of card 5.
- Your fourteen instances of card 5 (one original and thirteen copies) have no matching numbers and win no more cards.
Once all of the originals and copies have been processed, you end up with 1 instance of card 1, 2 instances of card 2, 4 instances of card 3, 8 instances of card 4, 14 instances of card 5, and 1 instance of card 6. In total, this example pile of scratchcards causes you to ultimately have 30 scratchcards!
Process all of the original and copied scratchcards until no more scratchcards are won. Including the original set of scratchcards, how many total scratchcards do you end up with?
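Before applying it to the DataFrame, the copying process can be checked on the example with a plain list (n_wins values taken from the bullets above):

```python
# matches per card in the worked example
n_wins = [4, 2, 2, 1, 0, 0]
cards = [1] * len(n_wins)  # one original of each card
for i, wins in enumerate(n_wins):
    # every instance of card i wins one copy of each of the next `wins` cards
    for j in range(i + 1, i + 1 + wins):
        cards[j] += cards[i]
print(cards, sum(cards))  # [1, 2, 4, 8, 14, 1] 30
```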
We will use the n_wins column we created before and go from there. We step
through each game. Each current game gets +1 card (the original). Then,
depending on its n_wins, we step through the following games; each of these
gets the card count of the current game added.
txt['cards'] = 0
for i, nwin in enumerate(txt.loc[:, 'n_wins']):
txt.loc[i, 'cards'] += 1
for j in range(1, nwin+1):
txt.loc[i+j, 'cards'] += txt.loc[i, 'cards']
txt.loc[:, ['n_wins', 'cards']]
| n_wins | cards | |
|---|---|---|
| 0 | 10 | 1 |
| 1 | 10 | 2 |
| 2 | 5 | 4 |
| 3 | 0 | 8 |
| 4 | 0 | 8 |
| … | … | … |
| 213 | 0 | 9608 |
| 214 | 0 | 8927 |
| 215 | 2 | 8927 |
| 216 | 1 | 12636 |
| 217 | 0 | 21564 |
218 rows × 2 columns
Now, we just sum the output again:
sum(txt['cards'])
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 4 - see https://adventofcode.com/2023/day/4
For the solutions, see the English version of this blog post.
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 5 - see https://adventofcode.com/2023/day/5
Update [2023-12-31 Sun]:
- substitute get_generator() (own implementation) with range() (Python built-in)
- improve the grid search so that it goes through all location ranges, still starting with the lowest range
The almanac (your puzzle input) lists all of the seeds that need to be planted. It also lists what type of soil to use with each kind of seed, what type of fertilizer to use with each kind of soil, what type of water to use with each kind of fertilizer, and so on. Every type of seed, soil, fertilizer and so on is identified with a number, but numbers are reused by each category - that is, soil 123 and fertilizer 123 aren’t necessarily related to each other.
For example:
seeds: 79 14 55 13 seed-to-soil map: 50 98 2 52 50 48 soil-to-fertilizer map: 0 15 37 37 52 2 39 0 15 fertilizer-to-water map: 49 53 8 0 11 42 42 0 7 57 7 4 water-to-light map: 88 18 7 18 25 70 light-to-temperature map: 45 77 23 81 45 19 68 64 13 temperature-to-humidity map: 0 69 1 1 0 69 humidity-to-location map: 60 56 37 56 93 4
The almanac starts by listing which seeds need to be planted: seeds 79, 14, 55, and 13.
The rest of the almanac contains a list of maps which describe how to convert numbers from a source category into numbers in a destination category. That is, the section that starts with seed-to-soil map: describes how to convert a seed number (the source) to a soil number (the destination). This lets the gardener and his team know which soil to use with which seeds, which water to use with which fertilizer, and so on.
Rather than list every source number and its corresponding destination number one by one, the maps describe entire ranges of numbers that can be converted. Each line within a map contains three numbers: the destination range start, the source range start, and the range length.
Consider again the example seed-to-soil map:
50 98 2
52 50 48
The first line has a destination range start of 50, a source range start of 98, and a range length of 2. This line means that the source range starts at 98 and contains two values: 98 and 99. The destination range is the same length, but it starts at 50, so its two values are 50 and 51. With this information, you know that seed number 98 corresponds to soil number 50 and that seed number 99 corresponds to soil number 51.
The second line means that the source range starts at 50 and contains 48 values: 50, 51, …, 96, 97. This corresponds to a destination range starting at 52 and also containing 48 values: 52, 53, …, 98, 99. So, seed number 53 corresponds to soil number 55.
Any source numbers that aren’t mapped correspond to the same destination number. So, seed number 10 corresponds to soil number 10.
So, the entire list of seed numbers and their corresponding soil numbers looks like this:
seed  soil
0     0
1     1
...   ...
48    48
49    49
50    52
51    53
...   ...
96    98
97    99
98    50
99    51
With this map, you can look up the soil number required for each initial seed number:
- Seed number 79 corresponds to soil number 81.
- Seed number 14 corresponds to soil number 14.
- Seed number 55 corresponds to soil number 57.
- Seed number 13 corresponds to soil number 13.
The gardener and his team want to get started as soon as possible, so they’d like to know the closest location that needs a seed. Using these maps, find the lowest location number that corresponds to any of the initial seeds. To do this, you’ll need to convert each seed number through other categories until you can find its corresponding location number. In this example, the corresponding types are:
- Seed 79, soil 81, fertilizer 81, water 81, light 74, temperature 78, humidity 78, location 82.
- Seed 14, soil 14, fertilizer 53, water 49, light 42, temperature 42, humidity 43, location 43.
- Seed 55, soil 57, fertilizer 57, water 53, light 46, temperature 82, humidity 82, location 86.
- Seed 13, soil 13, fertilizer 52, water 41, light 34, temperature 34, humidity 35, location 35.
So, the lowest location number in this example is 35.
What is the lowest location number that corresponds to any of the initial seed numbers?
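Before tackling the real input, the range lookup can be sketched on the example almanac above, with hard-coded (dest, src, rng) triples and plain functions instead of DataFrames:

```python
# The seven example maps, each a list of (dest_start, src_start, range) triples.
example_maps = [
    [(50, 98, 2), (52, 50, 48)],                         # seed-to-soil
    [(0, 15, 37), (37, 52, 2), (39, 0, 15)],             # soil-to-fertilizer
    [(49, 53, 8), (0, 11, 42), (42, 0, 7), (57, 7, 4)],  # fertilizer-to-water
    [(88, 18, 7), (18, 25, 70)],                         # water-to-light
    [(45, 77, 23), (81, 45, 19), (68, 64, 13)],          # light-to-temperature
    [(0, 69, 1), (1, 0, 69)],                            # temperature-to-humidity
    [(60, 56, 37), (56, 93, 4)],                         # humidity-to-location
]

def convert(value, mapping):
    # Each (dest, src, rng) triple maps [src, src+rng) onto [dest, dest+rng).
    for dest, src, rng in mapping:
        if src <= value < src + rng:
            return dest + (value - src)
    return value  # unmapped values stay the same

def locate(seed):
    for mapping in example_maps:
        seed = convert(seed, mapping)
    return seed

print([locate(s) for s in (79, 14, 55, 13)])   # [82, 43, 86, 35]
print(min(locate(s) for s in (79, 14, 55, 13)))  # 35
```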
My input file: 2023-12-05-1-aoc.txt
Wow, this task is a mouthful…
Let’s start slowly and load the data. Our input text document contains several
maps, which are clearly separated and have a title (seed-to-soil map, etc.). So
we can tell pandas where each map starts and give each map a dataframe. I got
the values for the skiprows and nrows arguments by looking at the input file
and… counting :)
import pandas as pd
import sys
seeds = pd.read_table('data/2023-12-05-1-aoc.txt', nrows=1, sep=' ',
                      header=None, index_col=0)
seeds = seeds.values.flatten()
map_opt = {'filepath_or_buffer': 'data/2023-12-05-1-aoc.txt',
           'header': None, 'sep': ' ', 'dtype': 'Int64',
           'names': ['dest_start', 'src_start', 'range']}
seed_soil = pd.read_table(skiprows=3, nrows=23, **map_opt)
soil_fert = pd.read_table(skiprows=28, nrows=9, **map_opt)
fert_water = pd.read_table(skiprows=39, nrows=20, **map_opt)
water_light = pd.read_table(skiprows=61, nrows=40, **map_opt)
light_temp = pd.read_table(skiprows=103, nrows=36, **map_opt)
temp_humi = pd.read_table(skiprows=141, nrows=35, **map_opt)
humi_loc = pd.read_table(skiprows=178, nrows=26, **map_opt)
maps = (seed_soil, soil_fert, fert_water, water_light, light_temp,
        temp_humi, humi_loc)
print('seeds are just a numpy array:')
display(seeds)
print('The "humidity-to-location" map as an example:')
humi_loc
| | dest_start | src_start | range |
|---|---|---|---|
| 0 | 3490144003 | 1623866227 | 218040905 |
| 1 | 1709610578 | 1620839197 | 3027030 |
| 2 | 105449249 | 586389428 | 113279526 |
| 3 | 1899604338 | 2167886292 | 348178199 |
| 4 | 1712637608 | 2678624215 | 186966730 |
| 5 | 218728775 | 0 | 245776251 |
| 6 | 2472992580 | 923734334 | 143670388 |
| 7 | 2616662968 | 3670169885 | 15297294 |
| 8 | 2247782537 | 1395629154 | 225210043 |
| 9 | 0 | 480940179 | 105449249 |
| 10 | 4113852846 | 3959729057 | 159909096 |
| 11 | 3322784653 | 1067404722 | 167359350 |
| 12 | 923734334 | 4119638153 | 175329143 |
| 13 | 1534964496 | 2516064491 | 162559724 |
| 14 | 2631960262 | 3496028440 | 140849733 |
| 15 | 2862695906 | 1234764072 | 97988355 |
| 16 | 1697524220 | 3636878173 | 12086358 |
| 17 | 2985646048 | 1332752427 | 62876727 |
| 18 | 1099063477 | 2970241510 | 435901019 |
| 19 | 4009202281 | 2865590945 | 104650565 |
| 20 | 2960684261 | 2142924505 | 24961787 |
| 21 | 2772809995 | 3406142529 | 89885911 |
| 22 | 4273761942 | 3648964531 | 21205354 |
| 23 | 3708184908 | 1841907132 | 301017373 |
| 24 | 464505026 | 245776251 | 235163928 |
| 25 | 3048522775 | 3685467179 | 274261878 |
My first attempt was to actually construct ranges like in the example above,
mapping out all possible sources and destinations. Python quickly informed me
that even constructing one pandas.Series with int64 values mapping seeds to
soil would cost 64 GB of memory - not the best solution.
So we take a different approach. For convenience, let’s add a src_end and a
dest_end column to our maps:
for df in maps:
    df['src_end'] = df.loc[:, 'src_start'] + df.loc[:, 'range']
    df['dest_end'] = df.loc[:, 'dest_start'] + df.loc[:, 'range']
print('Again the "humidity-to-location" map as an example:')
humi_loc
| | dest_start | src_start | range | src_end | dest_end |
|---|---|---|---|---|---|
| 0 | 3490144003 | 1623866227 | 218040905 | 1841907132 | 3708184908 |
| 1 | 1709610578 | 1620839197 | 3027030 | 1623866227 | 1712637608 |
| 2 | 105449249 | 586389428 | 113279526 | 699668954 | 218728775 |
| 3 | 1899604338 | 2167886292 | 348178199 | 2516064491 | 2247782537 |
| 4 | 1712637608 | 2678624215 | 186966730 | 2865590945 | 1899604338 |
| 5 | 218728775 | 0 | 245776251 | 245776251 | 464505026 |
| 6 | 2472992580 | 923734334 | 143670388 | 1067404722 | 2616662968 |
| 7 | 2616662968 | 3670169885 | 15297294 | 3685467179 | 2631960262 |
| 8 | 2247782537 | 1395629154 | 225210043 | 1620839197 | 2472992580 |
| 9 | 0 | 480940179 | 105449249 | 586389428 | 105449249 |
| 10 | 4113852846 | 3959729057 | 159909096 | 4119638153 | 4273761942 |
| 11 | 3322784653 | 1067404722 | 167359350 | 1234764072 | 3490144003 |
| 12 | 923734334 | 4119638153 | 175329143 | 4294967296 | 1099063477 |
| 13 | 1534964496 | 2516064491 | 162559724 | 2678624215 | 1697524220 |
| 14 | 2631960262 | 3496028440 | 140849733 | 3636878173 | 2772809995 |
| 15 | 2862695906 | 1234764072 | 97988355 | 1332752427 | 2960684261 |
| 16 | 1697524220 | 3636878173 | 12086358 | 3648964531 | 1709610578 |
| 17 | 2985646048 | 1332752427 | 62876727 | 1395629154 | 3048522775 |
| 18 | 1099063477 | 2970241510 | 435901019 | 3406142529 | 1534964496 |
| 19 | 4009202281 | 2865590945 | 104650565 | 2970241510 | 4113852846 |
| 20 | 2960684261 | 2142924505 | 24961787 | 2167886292 | 2985646048 |
| 21 | 2772809995 | 3406142529 | 89885911 | 3496028440 | 2862695906 |
| 22 | 4273761942 | 3648964531 | 21205354 | 3670169885 | 4294967296 |
| 23 | 3708184908 | 1841907132 | 301017373 | 2142924505 | 4009202281 |
| 24 | 464505026 | 245776251 | 235163928 | 480940179 | 699668954 |
| 25 | 3048522775 | 3685467179 | 274261878 | 3959729057 | 3322784653 |
Now we actually compute the mapping. For each seed, we go through all maps, and in each map we go through each row. We find the row that contains the current value and extract the destination, which becomes the source for the next map, until we reach the last map, which gives us the location.
- Approach 1: df.itertuples() is a convenient way to step through a pandas.DataFrame in this example. It is faster than df.iterrows() and returns the row as a NamedTuple - nice!
- Approach 2: I actually wondered if it would be faster to get all maps in one pd.DataFrame and then iterate through the mappings. To test this, let’s construct a new DataFrame maps_df which contains all maps. Since the maps have different lengths, it is important to cast the datatype to Int64, which is short for pd.Int64Dtype() and keeps values as integers, even if NA values are in the same column.
- Approach 3: A third alternative I tested (not shown here) was to check if a value is in a Python range with the in operator, as in if 3 in range(5): .... This was way too slow.
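For illustration, here is itertuples on a toy frame (made up for this sketch, not the real maps): each row arrives as a namedtuple, so columns are plain attributes.

```python
import pandas as pd

# itertuples yields one namedtuple per row; the index is the .Index attribute.
toy = pd.DataFrame({'src_start': [0, 10], 'src_end': [5, 20]})
rows = [(row.Index, row.src_start, row.src_end) for row in toy.itertuples()]
print(rows)  # [(0, 0, 5), (1, 10, 20)]
```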
# mapping version 1
def get_location(seed):
    current = seed
    for df in maps:
        current_map = [row
                       for row in df.itertuples()
                       if ((current >= row.src_start)
                           and (current < row.src_end))]
        if len(current_map) == 0:
            pass
        elif len(current_map) == 1:
            current = (current_map[0].dest_start
                       + (current - current_map[0].src_start))
        else:
            raise ValueError('This should not happen!')
    return current
# mapping version 2 - around 10 times slower
# you need to rename the maps_df columns so that they have a unique id
# e.g. 'src_start_1', 'src_start_2' etc
# maps_df = pd.concat(maps, axis='columns')
# def get_dest(i, src):
#     return (maps_df[(src > maps_df.loc[:, f'src_start_{i}']) &
#                     (src < maps_df.loc[:, f'src_end_{i}'])]
#             .loc[:, f'dest_start_{i}']
#             .iloc[0])
# def get_location2(seed):
#     dest = seed
#     i = 1
#     while i < 7:
#         dest = get_dest(i, dest)
#         i += 1
#     return dest
%timeit get_location(seeds[0])
Now let’s get a list of locations:
locations = [get_location(s) for s in seeds]
locations
Lastly, just get the minimum of all location values.
min(locations)
Everyone will starve if you only plant such a small number of seeds. Re-reading the almanac, it looks like the seeds: line actually describes ranges of seed numbers.
The values on the initial seeds: line come in pairs. Within each pair, the first value is the start of the range and the second value is the length of the range. So, in the first line of the example above:
seeds: 79 14 55 13
This line describes two ranges of seed numbers to be planted in the garden. The first range starts with seed number 79 and contains 14 values: 79, 80, …, 91, 92. The second range starts with seed number 55 and contains 13 values: 55, 56, …, 66, 67.
Now, rather than considering four seed numbers, you need to consider a total of 27 seed numbers.
In the above example, the lowest location number can be obtained from seed number 82, which corresponds to soil 84, fertilizer 84, water 84, light 77, temperature 45, humidity 46, and location 46. So, the lowest location number is 46.
Consider all of the initial seed numbers listed in the ranges on the first line of the almanac. What is the lowest location number that corresponds to any of the initial seed numbers?
Let’s first construct a dataframe containing the range of seeds:
seeds_df = pd.DataFrame({'start': seeds[::2],
                         'range': seeds[1::2],
                         'end': seeds[::2] + seeds[1::2]})
print(f'There are {sum(seeds_df.loc[:, "range"]):_} seeds in total')
seeds_df
| | start | range | end |
|---|---|---|---|
| 0 | 3169137700 | 271717609 | 3440855309 |
| 1 | 3522125441 | 23376095 | 3545501536 |
| 2 | 1233948799 | 811833837 | 2045782636 |
| 3 | 280549587 | 703867355 | 984416942 |
| 4 | 166086528 | 44766996 | 210853524 |
| 5 | 2326968141 | 69162222 | 2396130363 |
| 6 | 2698492851 | 14603069 | 2713095920 |
| 7 | 2755327667 | 348999531 | 3104327198 |
| 8 | 2600461189 | 92332846 | 2692794035 |
| 9 | 1054656969 | 169099767 | 1223756736 |
Now - I really needed some time to finally realize that going through all seed values is infeasible. So how do we deal with this problem?
In the end we need the lowest location number - so our approach is to take
the humi_loc map, start with the lowest location number, go up, and get the
corresponding seed values. The location of the first seed value that lies inside
seeds_df is our solution.
So first we rework the get_location function into a get_seed function (we
reverse the maps tuple with maps[::-1] and switch src and dest).
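The inversion itself is just the mapping arithmetic run backwards; on one line of the example seed-to-soil map:

```python
# One line of the example seed-to-soil map: [50, 98) -> maps [98, 100),
# the triple (52, 50, 48) maps [50, 98) onto [52, 100).
# Forward:  dest + (value - src)   for src  <= value < src  + rng
# Backward: src  + (value - dest)  for dest <= value < dest + rng
dest_start, src_start, rng = 52, 50, 48

def forward(v):
    return dest_start + (v - src_start)

def backward(v):
    return src_start + (v - dest_start)

print(forward(53))            # seed 53 -> soil 55, as in the puzzle text
print(backward(forward(53)))  # round-trips back to 53
```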
def get_seed(location):
    current = location
    for df in maps[::-1]:
        current_map = [row
                       for row in df.itertuples()
                       if ((current >= row.dest_start)
                           and (current < row.dest_end))]
        if len(current_map) == 0:
            pass
        elif len(current_map) == 1:
            current = (current_map[0].src_start
                       + (current - current_map[0].dest_start))
        else:
            raise ValueError('This should not happen!')
    return current
Since the Python 3 range function does not store its contents in memory
(similar to a generator), it is well suited to stepping through these large ranges.
Lastly, we deal with the sheer number of possible values by performing a grid
search. First, we check every millionth location. After the first match, we
stop this search and check the last million locations before the match with a
finer grid, and so on. The last grid is just 1, so we find our lowest location.
Update: We order our location ranges from smallest to largest with sort_values and
go through them - the search stops after the first match, since that will be
our lowest location.
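The coarse-to-fine idea in isolation, on a toy predicate (function name, grids and predicate are made up for illustration; it assumes each coarser pass finds a hit within one grid step of the true minimum):

```python
def lowest_hit(start, end, grids, hit):
    """Find the smallest x in [start, end) with hit(x) by refining the
    search window around the first match of each grid pass."""
    found = None
    for g in grids:
        for x in range(start, end, g):
            if hit(x):
                found = x
                start, end = max(start, x - g), x + 1  # refine around x
                break
    return found

# Toy predicate: everything from 8_765_432 upwards "hits".
print(lowest_hit(0, 10_000_000, [1_000_000, 1000, 1],
                 lambda x: x >= 8_765_432))  # 8765432
```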
def grid_search(start: int, end: int, grid: list[int]):
    success = False
    for g in grid:
        print(f'Start search with grid={g}')
        for l in range(start, end, g):
            current = get_seed(l)
            if any((current >= seeds_df.loc[:, 'start'])
                   & (current < seeds_df.loc[:, 'end'])):
                print(f'location {l} is the lowest which contains '
                      f'one of the given seeds ({current})')
                start = l - g
                end = l + 1  # keep l itself inside the refined window
                success = True
                break
    return success
for row in humi_loc.sort_values('dest_start').itertuples():
    success = grid_search(start=row.dest_start, end=row.dest_end,
                          grid=[1_000_000, 100_000, 1000, 1])
    if success:
        print('Finished search')
        break
This year I am attempting the Advent of Code 2023 challenges. This is Day 5 - see https://adventofcode.com/2023/day/5
For the solutions, see the English version of this blog post.
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 6 - see https://adventofcode.com/2023/day/6
You manage to sign up as a competitor in the boat races just in time. The organizer explains that it’s not really a traditional race - instead, you will get a fixed amount of time during which your boat has to travel as far as it can, and you win if your boat goes the farthest.
As part of signing up, you get a sheet of paper (your puzzle input) that lists the time allowed for each race and also the best distance ever recorded in that race. To guarantee you win the grand prize, you need to make sure you go farther in each race than the current record holder.
The organizer brings you over to the area where the boat races are held. The boats are much smaller than you expected - they’re actually toy boats, each with a big button on top. Holding down the button charges the boat, and releasing the button allows the boat to move. Boats move faster if their button was held longer, but time spent holding the button counts against the total race time. You can only hold the button at the start of the race, and boats don’t move until the button is released.
For example:
Time:      7  15   30
Distance:  9  40  200
This document describes three races:
- The first race lasts 7 milliseconds. The record distance in this race is 9 millimeters.
- The second race lasts 15 milliseconds. The record distance in this race is 40 millimeters.
- The third race lasts 30 milliseconds. The record distance in this race is 200 millimeters.
Your toy boat has a starting speed of zero millimeters per millisecond. For each whole millisecond you spend at the beginning of the race holding down the button, the boat’s speed increases by one millimeter per millisecond.
So, because the first race lasts 7 milliseconds, you only have a few options:
- Don’t hold the button at all (that is, hold it for =0= milliseconds) at the start of the race. The boat won’t move; it will have traveled =0= millimeters by the end of the race.
- Hold the button for =1= millisecond at the start of the race. Then, the boat will travel at a speed of 1 millimeter per millisecond for 6 milliseconds, reaching a total distance traveled of =6= millimeters.
- Hold the button for =2= milliseconds, giving the boat a speed of 2 millimeters per millisecond. It will then get 5 milliseconds to move, reaching a total distance of =10= millimeters.
- Hold the button for =3= milliseconds. After its remaining 4 milliseconds of travel time, the boat will have gone =12= millimeters.
- Hold the button for =4= milliseconds. After its remaining 3 milliseconds of travel time, the boat will have gone =12= millimeters.
- Hold the button for =5= milliseconds, causing the boat to travel a total of =10= millimeters.
- Hold the button for =6= milliseconds, causing the boat to travel a total of =6= millimeters.
- Hold the button for =7= milliseconds. That’s the entire duration of the race. You never let go of the button. The boat can’t move until you let go of the button. Please make sure you let go of the button so the boat gets to move. =0= millimeters.
Since the current record for this race is 9 millimeters, there are actually =4= different ways you could win: you could hold the button for 2, 3, 4, or 5 milliseconds at the start of the race.
In the second race, you could hold the button for at least 4 milliseconds and at most 11 milliseconds and beat the record, a total of =8= different ways to win.
In the third race, you could hold the button for at least 11 milliseconds and no more than 19 milliseconds and still beat the record, a total of =9= ways you could win.
To see how much margin of error you have, determine the number of ways you can beat the record in each race; in this example, if you multiply these values together, you get =288= (4 * 8 * 9).
Determine the number of ways you could beat the record in each race. What do you get if you multiply these numbers together?
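The example races can be checked by brute force before touching the real input: holding for t milliseconds gives speed t and travel distance t · (time − t).

```python
# Count winning hold times for each example race, then multiply.
times, records = [7, 15, 30], [9, 40, 200]
ways = [sum(1 for t in range(1, time)
            if t * (time - t) > record)
        for time, record in zip(times, records)]
print(ways)  # [4, 8, 9]
prod = ways[0] * ways[1] * ways[2]
print(prod)  # 288
```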
My input file: 2023-12-06-1-aoc.txt
The solution for this problem is shorter than the task description!
As always - let’s load the data. Note: we use the delim_whitespace argument
instead of sep=' ' to separate values by whitespace of any length.
import pandas as pd
import numpy as np
race = pd.read_table('data/2023-12-06-1-aoc.txt', delim_whitespace=True,
                     header=None, index_col=0)
race.index = ['time', 'distance']
race
| | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| time | 51 | 92 | 68 | 90 |
| distance | 222 | 2031 | 1126 | 1225 |
Then we use a Python list comprehension to quickly calculate all distances over
the range of the time value (keeping only the winning distances). Then we get the
respective lengths and compute the product.
def get_distance(time, distance):
    return [t * (time - t) for t in range(1, time + 1)
            if (t * (time - t)) > distance]

race.loc['nwins', :] = [len(get_distance(int(r.time), int(r.distance)))
                        for c, r in race.items()]
nwins_prod = np.prod(race.loc['nwins', :].values, dtype=int)
display(race)
print(f'The product of our nwins is {nwins_prod}')
| | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| time | 51.0 | 92.0 | 68.0 | 90.0 |
| distance | 222.0 | 2031.0 | 1126.0 | 1225.0 |
| nwins | 42.0 | 19.0 | 11.0 | 57.0 |
The product of our nwins is 500346
As the race is about to start, you realize the piece of paper with race times and record distances you got earlier actually just has very bad kerning. There’s really only one race - ignore the spaces between the numbers on each line.
So, the example from before:
Time:      7  15   30
Distance:  9  40  200
…now instead means this:
Time:      71530
Distance:  940200
Now, you have to figure out how many ways there are to win this single race. In this example, the race lasts for =71530= milliseconds and the record distance you need to beat is =940200= millimeters. You could hold the button anywhere from 14 to 71516 milliseconds and beat the record, a total of =71503= ways!
How many ways can you beat the record in this one much longer race?
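The single long example race can be verified the same brute-force way:

```python
# Part 2 example: one long race, still brute force.
time, record = 71530, 940200
ways = sum(1 for t in range(1, time) if t * (time - t) > record)
print(ways)  # 71503
```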
For this riddle we can take the same approach as for Part 1. The cell took 18s to evaluate on a consumer laptop from 2015 - no fancy workarounds needed :)
racetime = int(''.join(str(r) for r in race.loc['time', :4].astype(int)))
racedist = int(''.join(str(r) for r in race.loc['distance', :4].astype(int)))
racewins = len(get_distance(racetime, racedist))
race[5] = [racetime, racedist, racewins]
race
| | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| time | 51.0 | 92.0 | 68.0 | 90.0 | 51926890 |
| distance | 222.0 | 2031.0 | 1126.0 | 1225.0 | 222203111261225 |
| nwins | 42.0 | 19.0 | 11.0 | 57.0 | 42515755 |
This year I am attempting the Advent of Code 2023 challenges. This is Day 6 - see https://adventofcode.com/2023/day/6
For the solutions, see the English version of this blog post.
This year I try to record my attempt at solving the Advent of Code 2023 riddles. This is Day 7 - see https://adventofcode.com/2023/day/7
In Camel Cards, you get a list of hands, and your goal is to order them based on the strength of each hand. A hand consists of five cards labeled one of A, K, Q, J, T, 9, 8, 7, 6, 5, 4, 3, or 2. The relative strength of each card follows this order, where A is the highest and 2 is the lowest.
Every hand is exactly one type. From strongest to weakest, they are:
- Five of a kind, where all five cards have the same label: AAAAA
- Four of a kind, where four cards have the same label and one card has a different label: AA8AA
- Full house, where three cards have the same label, and the remaining two cards share a different label: 23332
- Three of a kind, where three cards have the same label, and the remaining two cards are each different from any other card in the hand: TTT98
- Two pair, where two cards share one label, two other cards share a second label, and the remaining card has a third label: 23432
- One pair, where two cards share one label, and the other three cards have a different label from the pair and each other: A23A4
- High card, where all cards’ labels are distinct: 23456
Hands are primarily ordered based on type; for example, every full house is stronger than any three of a kind.
If two hands have the same type, a second ordering rule takes effect. Start by comparing the first card in each hand. If these cards are different, the hand with the stronger first card is considered stronger. If the first card in each hand has the same label, however, then move on to considering the second card in each hand. If they differ, the hand with the higher second card wins; otherwise, continue with the third card in each hand, then the fourth, then the fifth.
So, 33332 and 2AAAA are both four of a kind hands, but 33332 is stronger because its first card is stronger. Similarly, 77888 and 77788 are both a full house, but 77888 is stronger because its third card is stronger (and both hands have the same first and second card).
To play Camel Cards, you are given a list of hands and their corresponding bid (your puzzle input). For example:
32T3K 765
T55J5 684
KK677 28
KTJJT 220
QQQJA 483
This example shows five hands; each hand is followed by its bid amount. Each hand wins an amount equal to its bid multiplied by its rank, where the weakest hand gets rank 1, the second-weakest hand gets rank 2, and so on up to the strongest hand. Because there are five hands in this example, the strongest hand will have rank 5 and its bid will be multiplied by 5.
So, the first step is to put the hands in order of strength:
- 32T3K is the only one pair and the other hands are all a stronger type, so it gets rank 1.
- KK677 and KTJJT are both two pair. Their first cards both have the same label, but the second card of KK677 is stronger (K vs T), so KTJJT gets rank 2 and KK677 gets rank 3.
- T55J5 and QQQJA are both three of a kind. QQQJA has a stronger first card, so it gets rank 5 and T55J5 gets rank 4.
Now, you can determine the total winnings of this set of hands by adding up the result of multiplying each hand’s bid with its rank (765 * 1 + 220 * 2 + 28 * 3 + 684 * 4 + 483 * 5). So the total winnings in this example are =6440=.
Find the rank of every hand in your set. What are the total winnings?
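Hand typing and ranking can be sketched on the five example hands with a compact Counter-based type function (this is a standalone sketch, not the implementation used below, but it uses the same type numbering: 1 = strongest, 7 = weakest):

```python
from collections import Counter

# Hand types keyed by the sorted face counts.
TYPE = {(5,): 1, (4, 1): 2, (3, 2): 3, (3, 1, 1): 4,
        (2, 2, 1): 5, (2, 1, 1, 1): 6, (1, 1, 1, 1, 1): 7}

def hand_type(cards):
    return TYPE[tuple(sorted(Counter(cards).values(), reverse=True))]

order = {c: i for i, c in enumerate('23456789TJQKA')}
hands = [('32T3K', 765), ('T55J5', 684), ('KK677', 28),
         ('KTJJT', 220), ('QQQJA', 483)]
# Weakest hand first: larger type number first, then card-by-card strength.
ranked = sorted(hands, key=lambda h: (-hand_type(h[0]),
                                      [order[c] for c in h[0]]))
total = sum(rank * bid for rank, (_, bid) in enumerate(ranked, start=1))
print(total)  # 6440
```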
My input file: 2023-12-07-1-aoc.txt
As always - let’s read in the data.
import pandas as pd
camel = pd.read_table('data/2023-12-07-1-aoc.txt', delim_whitespace=True,
                      header=None, names=['cards', 'bid'],
                      dtype={'cards': 'string', 'bid': 'Int64'})
camel
| | cards | bid |
|---|---|---|
| 0 | A833A | 309 |
| 1 | Q33J3 | 205 |
| 2 | 55KK5 | 590 |
| 3 | K4457 | 924 |
| 4 | Q3QT3 | 11 |
| … | … | … |
| 995 | KQJ53 | 367 |
| 996 | 57866 | 537 |
| 997 | Q94A9 | 210 |
| 998 | J448A | 903 |
| 999 | 6J66J | 114 |
1000 rows × 2 columns
For sorting, we need to compute a type column. The 7 types can be deduced
from how many different cards there are (via set()) and how often each card
occurs.
Also, for the next step, we prepare one column per card in the cards string.
def get_type(cards):
    cardset = set(cards)
    cardlen = len(cardset)
    cardlist = list(cards)
    cardnum = [cardlist.count(i) for i in cardset]
    if cardlen == 1:
        ctype = 1
    elif cardlen in (4, 5):
        ctype = cardlen + 2
    elif cardlen == 2:
        ctype = 2 if set(cardnum) == {1, 4} else 3
    else:
        ctype = 4 if 3 in cardnum else 5
    return ctype

camel['type'] = camel.loc[:, 'cards'].apply(lambda x: get_type(x))
for i in range(5):
    camel[f'c{i}'] = camel.loc[:, 'cards'].apply(lambda x: list(x)[i])
camel
| | cards | bid | type | c0 | c1 | c2 | c3 | c4 |
|---|---|---|---|---|---|---|---|---|
| 0 | A833A | 309 | 5 | A | 8 | 3 | 3 | A |
| 1 | Q33J3 | 205 | 4 | Q | 3 | 3 | J | 3 |
| 2 | 55KK5 | 590 | 3 | 5 | 5 | K | K | 5 |
| 3 | K4457 | 924 | 6 | K | 4 | 4 | 5 | 7 |
| 4 | Q3QT3 | 11 | 5 | Q | 3 | Q | T | 3 |
| … | … | … | … | … | … | … | … | … |
| 995 | KQJ53 | 367 | 7 | K | Q | J | 5 | 3 |
| 996 | 57866 | 537 | 6 | 5 | 7 | 8 | 6 | 6 |
| 997 | Q94A9 | 210 | 6 | Q | 9 | 4 | A | 9 |
| 998 | J448A | 903 | 6 | J | 4 | 4 | 8 | A |
| 999 | 6J66J | 114 | 3 | 6 | J | 6 | 6 | J |
1000 rows × 8 columns
For sorting, pandas provides the sort_values method, which supports
sorting by multiple columns (here, we first sort by type, then c0,
c1, …). For easier sorting by numerical values, we convert the values with
the card_dict dictionary, which implements the given sorting order.
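A toy illustration of the column-wise key= mapping (made-up values): sort_values applies the key function to each sort column before comparing, so the string faces are compared by their mapped positions.

```python
import pandas as pd

toy = pd.DataFrame({'c0': ['K', '2', 'A'], 'c1': ['3', 'T', '2']})
order = {c: i for i, c in enumerate('23456789TJQKA')}
toy_sorted = toy.sort_values(by=['c0', 'c1'], key=lambda col: col.map(order),
                             ignore_index=True)
print(toy_sorted['c0'].tolist())  # ['2', 'K', 'A']
```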
card_dict = {c: i for i, c in enumerate([7, 6, 5, 4, 3, 2, 1, '2', '3',
                                         '4', '5', '6', '7', '8', '9',
                                         'T', 'J', 'Q', 'K', 'A'])}
camel = camel.sort_values(by=['type', 'c0', 'c1', 'c2', 'c3', 'c4'],
                          ignore_index=True,
                          key=lambda x: x.map(card_dict))
camel.index += 1
camel
| | cards | bid | type | c0 | c1 | c2 | c3 | c4 |
|---|---|---|---|---|---|---|---|---|
| 1 | 237AQ | 157 | 7 | 2 | 3 | 7 | A | Q |
| 2 | 23K47 | 759 | 7 | 2 | 3 | K | 4 | 7 |
| 3 | 249K8 | 612 | 7 | 2 | 4 | 9 | K | 8 |
| 4 | 264A7 | 341 | 7 | 2 | 6 | 4 | A | 7 |
| 5 | 26578 | 10 | 7 | 2 | 6 | 5 | 7 | 8 |
| … | … | … | … | … | … | … | … | … |
| 996 | AA5AA | 565 | 2 | A | A | 5 | A | A |
| 997 | AAATA | 704 | 2 | A | A | A | T | A |
| 998 | AAAA3 | 145 | 2 | A | A | A | A | 3 |
| 999 | AAAAJ | 594 | 2 | A | A | A | A | J |
| 1000 | JJJJJ | 219 | 1 | J | J | J | J | J |
1000 rows × 8 columns
Lastly, we multiply the bid with the index and sum the result.
camel.loc[:, 'bid'].mul(camel.index).sum()
To make things a little more interesting, the Elf introduces one additional rule. Now, J cards are jokers - wildcards that can act like whatever card would make the hand the strongest type possible.
To balance this, =J= cards are now the weakest individual cards, weaker even than 2. The other cards stay in the same order: A, K, Q, T, 9, 8, 7, 6, 5, 4, 3, 2, J.
J cards can pretend to be whatever card is best for the purpose of determining hand type; for example, QJJQ2 is now considered four of a kind. However, for the purpose of breaking ties between two hands of the same type, J is always treated as J, not the card it’s pretending to be: JKKK2 is weaker than QQQQ2 because J is weaker than Q.
Now, the above example goes very differently:
32T3K 765
T55J5 684
KK677 28
KTJJT 220
QQQJA 483
- 32T3K is still the only one pair; it doesn’t contain any jokers, so its strength doesn’t increase.
- KK677 is now the only two pair, making it the second-weakest hand.
- T55J5, KTJJT, and QQQJA are now all four of a kind! T55J5 gets rank 3, QQQJA gets rank 4, and KTJJT gets rank 5.
With the new joker rule, the total winnings in this example are =5905=.
Using the new joker rule, find the rank of every hand in your set. What are the new total winnings?
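With jokers, the example can again be checked with a small Counter-based sketch (standalone, not the implementation below). It relies on the observation that letting the jokers join the most frequent remaining face always yields the strongest possible type:

```python
from collections import Counter

TYPE = {(5,): 1, (4, 1): 2, (3, 2): 3, (3, 1, 1): 4,
        (2, 2, 1): 5, (2, 1, 1, 1): 6, (1, 1, 1, 1, 1): 7}

def joker_type(cards):
    counts = Counter(cards)
    jokers = counts.pop('J', 0)
    faces = sorted(counts.values(), reverse=True) or [0]  # handles 'JJJJJ'
    faces[0] += jokers  # jokers join the most frequent face
    return TYPE[tuple(faces)]

order = {c: i for i, c in enumerate('J23456789TQKA')}  # J is weakest now
hands = [('32T3K', 765), ('T55J5', 684), ('KK677', 28),
         ('KTJJT', 220), ('QQQJA', 483)]
ranked = sorted(hands, key=lambda h: (-joker_type(h[0]),
                                      [order[c] for c in h[0]]))
total = sum(rank * bid for rank, (_, bid) in enumerate(ranked, start=1))
print(total)  # 5905
```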
What changes? We have to update our get_type() function and our card_dict
dictionary; otherwise the method stays the same!
Let’s first write get_type_joker(): if no joker is in the cards, we can just
defer to get_type(). If we have 1, 2 or 5 different card faces, the answer is
trivial. We only have to use the number of jokers and the maximum count of the
other faces if we have 3 or 4 different faces.
Note: if a J is present, we can never end up with high card (type = 7), because
we can always form one pair. Likewise, we can never end up with two pair (type = 5),
because we can always construct three of a kind or a full house instead.
def get_type_joker(cards):
    cardset = set(cards)
    cardlen = len(cardset)
    cardlist = list(cards)
    cardnum = [cardlist.count(i) for i in cardset]
    if 'J' not in cardset:
        ctype = get_type(cards)
    else:
        idxj = list(cardset).index('J')
        numj = cardnum[idxj]
        cardnum.pop(idxj)
        if cardlen in [1, 2]:
            ctype = 1
        elif cardlen == 5:
            ctype = 6
        else:
            if (numj in [1, 2]) and (cardlen == 4):
                ctype = 4
            else:
                ctype = 6 - max(cardnum) - numj
    return ctype

camel['type'] = camel.loc[:, 'cards'].apply(lambda x: get_type_joker(x))
camel
| | cards | bid | type | c0 | c1 | c2 | c3 | c4 |
|---|---|---|---|---|---|---|---|---|
| 1 | 237AQ | 157 | 7 | 2 | 3 | 7 | A | Q |
| 2 | 23K47 | 759 | 7 | 2 | 3 | K | 4 | 7 |
| 3 | 249K8 | 612 | 7 | 2 | 4 | 9 | K | 8 |
| 4 | 264A7 | 341 | 7 | 2 | 6 | 4 | A | 7 |
| 5 | 26578 | 10 | 7 | 2 | 6 | 5 | 7 | 8 |
| … | … | … | … | … | … | … | … | … |
| 996 | AA5AA | 565 | 2 | A | A | 5 | A | A |
| 997 | AAATA | 704 | 2 | A | A | A | T | A |
| 998 | AAAA3 | 145 | 2 | A | A | A | A | 3 |
| 999 | AAAAJ | 594 | 1 | A | A | A | A | J |
| 1000 | JJJJJ | 219 | 1 | J | J | J | J | J |
1000 rows × 8 columns
Now we just move J to where it belongs in our new card_dict_joker; the rest
of the solution is the same.
card_dict_joker = {c: i for i, c in enumerate(
    [7, 6, 5, 4, 3, 2, 1, 'J', '2', '3', '4', '5', '6', '7', '8', '9',
     'T', 'Q', 'K', 'A'])}
camel = camel.sort_values(by=['type', 'c0', 'c1', 'c2', 'c3', 'c4'],
                          ignore_index=True,
                          key=lambda x: x.map(card_dict_joker))
camel.index += 1
display(camel)
print('Our new solution: ', camel.loc[:, 'bid'].mul(camel.index).sum())
| | cards | bid | type | c0 | c1 | c2 | c3 | c4 |
|---|---|---|---|---|---|---|---|---|
| 1 | 237AQ | 157 | 7 | 2 | 3 | 7 | A | Q |
| 2 | 23K47 | 759 | 7 | 2 | 3 | K | 4 | 7 |
| 3 | 249K8 | 612 | 7 | 2 | 4 | 9 | K | 8 |
| 4 | 264A7 | 341 | 7 | 2 | 6 | 4 | A | 7 |
| 5 | 26578 | 10 | 7 | 2 | 6 | 5 | 7 | 8 |
| … | … | … | … | … | … | … | … | … |
| 996 | TJTTJ | 609 | 1 | T | J | T | T | J |
| 997 | QQJQJ | 131 | 1 | Q | Q | J | Q | J |
| 998 | QQQJQ | 831 | 1 | Q | Q | Q | J | Q |
| 999 | KKKJK | 183 | 1 | K | K | K | J | K |
| 1000 | AAAAJ | 594 | 1 | A | A | A | A | J |
1000 rows × 8 columns
Our new solution: 246894760
This year I am attempting the Advent of Code 2023 challenges. This is Day 7 - see https://adventofcode.com/2023/day/7
For the solutions, see the English version of this blog post.
I try to solve this year’s Advent of Code 2024 riddles in R. This is Day 1 - see https://adventofcode.com/2024/day/1
Throughout the Chief’s office, the historically significant locations are listed not by name but by a unique number called the location ID. To make sure they don’t miss anything, The Historians split into two groups, each searching the office and trying to create their own complete list of location IDs.
There’s just one problem: by holding the two lists up side by side (your puzzle input), it quickly becomes clear that the lists aren’t very similar. Maybe you can help The Historians reconcile their lists?
For example:
3   4
4   3
2   5
1   3
3   9
3   3
Maybe the lists are only off by a small amount! To find out, pair up the numbers and measure how far apart they are. Pair up the smallest number in the left list with the smallest number in the right list, then the second-smallest left number with the second-smallest right number, and so on.
Within each pair, figure out how far apart the two numbers are; you’ll need to add up all of those distances. For example, if you pair up a =3= from the left list with a =7= from the right list, the distance apart is =4=; if you pair up a =9= with a =3=, the distance apart is =6=.
In the example list above, the pairs and distances would be as follows:
- The smallest number in the left list is =1=, and the smallest number in the right list is =3=. The distance between them is =2=.
- The second-smallest number in the left list is =2=, and the second-smallest number in the right list is another =3=. The distance between them is =1=.
- The third-smallest number in both lists is =3=, so the distance between them is =0=.
- The next numbers to pair up are =3= and =4=, a distance of =1=.
- The fifth-smallest numbers in each list are =3= and =5=, a distance of =2=.
- Finally, the largest number in the left list is =4=, while the largest number in the right list is =9=; these are a distance =5= apart.
To find the total distance between the left list and the right list, add up the distances between all of the pairs you found. In the example above, this is =2 + 1 + 0 + 1 + 2 + 5=, a total distance of =11=!
Your actual left and right lists contain many location IDs. What is the total distance between your lists?
My input file: 2024-12-01-1-aoc.txt
First, we read the data - the fread command from the data.table package is
more versatile than read.delim from base R and reads the data directly as a
data.table, which has some benefits.
data <- data.table::fread("2024-12-01-1-aoc.txt", header = FALSE)
head(data)
| V1 | V2 |
|---|---|
| <int> | <int> |
| 41226 | 69190 |
| 89318 | 10100 |
| 59419 | 23880 |
| 63157 | 20193 |
| 81510 | 22869 |
| 83942 | 63304 |
Now, the idea is to get the ordered versions of V1 and V2.
v1 <- data[order(V1)]$V1
v2 <- data[order(V2)]$V2
head(v1)
Coming from Python, this syntax looks a bit odd. The documentation of
data.table::setorder helps:
setorder (and setorderv) reorders the rows of a data.table based on the columns (and column order) provided. It reorders the table by reference and is therefore very memory efficient. Note that queries like x[order(.)] are optimised internally to use data.table's fast order.
So, data[order(V1)] is essentially equivalent to
data.table::setorder(data, V1) - with the difference that setorder reorders
the table in place. Then, we extract the column as a vector using the $
operator.
head(data.table::setorder(data, V1)$V1)
The actual computation is just the sum of the absolute difference:
sum(abs(v1 - v2))
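Coming from Python myself, here is the same pairing logic on the example lists in plain Python for comparison (a sketch, not part of the R solution):

```python
# The example lists from the puzzle; sort both, pair them up,
# and sum the absolute distance of each pair.
left = [3, 4, 2, 1, 3, 3]
right = [4, 3, 5, 3, 9, 3]
total = sum(abs(a - b) for a, b in zip(sorted(left), sorted(right)))
print(total)  # 11, as in the puzzle description
```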
This time, you’ll need to figure out exactly how often each number from the left list appears in the right list. Calculate a total similarity score by adding up each number in the left list after multiplying it by the number of times that number appears in the right list.
Here are the same example lists again:
3   4
4   3
2   5
1   3
3   9
3   3
For these example lists, here is the process of finding the similarity score:
- The first number in the left list is =3=. It appears in the right list three times, so the similarity score increases by =3 * 3 = 9=.
- The second number in the left list is =4=. It appears in the right list once, so the similarity score increases by =4 * 1 = 4=.
- The third number in the left list is =2=. It does not appear in the right list, so the similarity score does not increase (=2 * 0 = 0=).
- The fourth number, =1=, also does not appear in the right list.
- The fifth number, =3=, appears in the right list three times; the similarity score increases by =9=.
- The last number, =3=, appears in the right list three times; the similarity score again increases by =9=.
So, for these example lists, the similarity score at the end of this process is =31= (=9 + 4 + 0 + 0 + 9 + 9=).
Once again consider your left and right lists. What is their similarity score?
My idea here is simple - we first count the occurrences in V2 so that we can
check V1 against them. Applying the table command works like
pandas.value_counts() and achieves this. We convert the output to a
data.frame and assign the V2 values as row names. Note: table converts the
counted values to strings, e.g. "10019" instead of 10019.
tab <- table(v2)
v2series <- data.frame(c(tab), row.names = names(tab))
head(v2series)
v2series["10019", ]
| c.tab. | |
|---|---|
| <int> | |
| 10019 | 1 |
| 10100 | 1 |
| 10206 | 1 |
| 10428 | 1 |
| 10645 | 1 |
| 10972 | 1 |
1
Now, we loop through V1 and look up the frequency in V2 (converting to string
first, via as.character). If nothing is found in V2, an NA value is returned.
For everything that is not NA, we compute the similarity score, append it to a
list, and sum at the end. Note: a list in R cannot be summed directly (sum
expects an atomic vector) - so we have to unlist first.
sim <- list()
for (num in v1) {
myfreq <- v2series[as.character(num), ]
if (!is.na(myfreq)) {
score <- myfreq * num
sim <- append(sim, score)
}
}
sum(unlist(sim))
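For comparison, the same counting idea in Python, where collections.Counter plays the role of R's table (a sketch on the example lists; Counter returns 0 for missing keys, so no NA handling is needed):

```python
from collections import Counter

left = [3, 4, 2, 1, 3, 3]
right = [4, 3, 5, 3, 9, 3]

# Count occurrences in the right list, then weight each left value
# by how often it appears on the right.
counts = Counter(right)
score = sum(n * counts[n] for n in left)
print(score)  # 31, as in the puzzle description
```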
This year I am attempting the Advent of Code 2024 challenges in the statistics-focused programming language R. This is Day 1 - see https://adventofcode.com/2024/day/1
For the solutions, see the English version of this blog post.
I try to solve this year’s Advent of Code 2024 riddles in R. This is Day 2 - see https://adventofcode.com/2024/day/2
The unusual data (your puzzle input) consists of many reports, one report per line. Each report is a list of numbers called levels that are separated by spaces. For example:
7 6 4 2 1
1 2 7 8 9
9 7 6 2 1
1 3 2 4 5
8 6 4 4 1
1 3 6 7 9
This example data contains six reports each containing five levels.
The engineers are trying to figure out which reports are safe. The Red-Nosed reactor safety systems can only tolerate levels that are either gradually increasing or gradually decreasing. So, a report only counts as safe if both of the following are true:
- The levels are either all increasing or all decreasing.
- Any two adjacent levels differ by at least one and at most three.
In the example above, the reports can be found safe or unsafe by checking those rules:
- =7 6 4 2 1=: Safe because the levels are all decreasing by 1 or 2.
- =1 2 7 8 9=: Unsafe because =2 7= is an increase of 5.
- =9 7 6 2 1=: Unsafe because =6 2= is a decrease of 4.
- =1 3 2 4 5=: Unsafe because =1 3= is increasing but =3 2= is decreasing.
- =8 6 4 4 1=: Unsafe because =4 4= is neither an increase or a decrease.
- =1 3 6 7 9=: Safe because the levels are all increasing by 1, 2, or 3.
So, in this example, =2= reports are safe.
Analyze the unusual data from the engineers. How many reports are safe?
My input file: 2024-12-02-1-aoc.txt
First, let’s load the data and take a look:
data <- data.table::fread("2024-12-02-1-aoc.txt", header = FALSE, fill = TRUE)
head(data)
| V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 |
|---|---|---|---|---|---|---|---|
| <int> | <int> | <int> | <int> | <int> | <int> | <int> | <int> |
| 74 | 76 | 78 | 79 | 76 | NA | NA | NA |
| 38 | 40 | 43 | 44 | 44 | NA | NA | NA |
| 1 | 2 | 4 | 6 | 8 | 9 | 13 | NA |
| 65 | 68 | 70 | 72 | 75 | 76 | 81 | NA |
| 89 | 91 | 92 | 95 | 93 | 94 | NA | NA |
| 15 | 17 | 16 | 18 | 19 | 17 | NA | NA |
My core idea here was to compute the differences between adjacent elements in a
row - luckily, R provides a built-in diff() function that computes exactly what
I need. Then, we check if all differences are either 1, 2, 3 or -1, -2, -3.
Note that each row has a different number of elements, so we have to omit NA
values via na.omit before applying the check.
isalltrue <- function(rep) {
return(all(diff(rep) %in% c(1, 2, 3)) |
all(diff(rep) %in% c(-1, -2, -3)))
}
isreport <- function(row) {
return(isalltrue(na.omit(row)))
}
valid.reports <- apply(data, 1, isreport)
table(valid.reports)
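As a cross-check, the same safety rule in Python on the six example reports (a sketch; `is_safe` is my own name for the check):

```python
def is_safe(levels):
    # A report is safe if all adjacent differences are in {1, 2, 3}
    # (gradually increasing) or all in {-1, -2, -3} (gradually decreasing).
    diffs = [b - a for a, b in zip(levels, levels[1:])]
    return (all(d in (1, 2, 3) for d in diffs)
            or all(d in (-1, -2, -3) for d in diffs))

reports = [[7, 6, 4, 2, 1], [1, 2, 7, 8, 9], [9, 7, 6, 2, 1],
           [1, 3, 2, 4, 5], [8, 6, 4, 4, 1], [1, 3, 6, 7, 9]]
print(sum(is_safe(r) for r in reports))  # 2 safe reports, as in the example
```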
The engineers are surprised by the low number of safe reports until they realize they forgot to tell you about the Problem Dampener.
The Problem Dampener is a reactor-mounted module that lets the reactor safety systems tolerate a single bad level in what would otherwise be a safe report. It’s like the bad level never happened!
Now, the same rules apply as before, except if removing a single level from an unsafe report would make it safe, the report instead counts as safe.
More of the above example’s reports are now safe:
- =7 6 4 2 1=: Safe without removing any level.
- =1 2 7 8 9=: Unsafe regardless of which level is removed.
- =9 7 6 2 1=: Unsafe regardless of which level is removed.
- =1 3 2 4 5=: Safe by removing the second level, =3=.
- =8 6 4 4 1=: Safe by removing the third level, =4=.
- =1 3 6 7 9=: Safe without removing any level.
Thanks to the Problem Dampener, =4= reports are actually safe!
Update your analysis by handling situations where the Problem Dampener can remove a single level from unsafe reports. How many reports are now safe?
So, this task is a good example of how easy it is to get lost in an overly complicated approach. My first approach was basically to somehow check whether the general trend was ascending or descending and, based on that, to remove opposite differences. This was too complicated, and in the end I found four fewer safe reports than were needed… (Also note the beautiful print debugging…)
My actual working approach was, in the end, brute force: I just threw out each element of a row in turn and checked whether the report became safe - much simpler and more effective!
isreport2 <- function(row) {
rep <- na.omit(row)
out <- isalltrue(rep)
if (out == FALSE) {
for (i in seq_along(rep)) {
      # do not write rep <- rep[-i] here: that would drop elements
      # cumulatively across iterations; rep.i holds the report with
      # only element i removed
      rep.i <- rep[-i]
out <- isalltrue(rep.i)
if (out == TRUE) {
break
}
}
}
return(out)
}
sum(apply(data, 1, isreport2))
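The brute-force dampener is just as short in Python (a sketch on the example reports; `is_safe` repeats the part-one check so the block is self-contained):

```python
def is_safe(levels):
    diffs = [b - a for a, b in zip(levels, levels[1:])]
    return (all(d in (1, 2, 3) for d in diffs)
            or all(d in (-1, -2, -3) for d in diffs))

def is_safe_dampened(levels):
    # Safe as-is, or safe after dropping any single level (brute force).
    return is_safe(levels) or any(
        is_safe(levels[:i] + levels[i + 1:]) for i in range(len(levels)))

reports = [[7, 6, 4, 2, 1], [1, 2, 7, 8, 9], [9, 7, 6, 2, 1],
           [1, 3, 2, 4, 5], [8, 6, 4, 4, 1], [1, 3, 6, 7, 9]]
print(sum(is_safe_dampened(r) for r in reports))  # 4 safe reports
```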
This year I am attempting the Advent of Code 2024 challenges in the statistics-focused programming language R. This is Day 2 - see https://adventofcode.com/2024/day/2
For the solutions, see the English version of this blog post.
I try to solve this year’s Advent of Code 2024 riddles in R. This is Day 3 - see https://adventofcode.com/2024/day/3
The computer appears to be trying to run a program, but its memory (your puzzle input) is corrupted. All of the instructions have been jumbled up!
It seems like the goal of the program is just to multiply some numbers. It does that with instructions like =mul(X,Y)=, where =X= and =Y= are each 1-3 digit numbers. For instance, =mul(44,46)= multiplies =44= by =46= to get a result of =2024=. Similarly, =mul(123,4)= would multiply =123= by =4=.
However, because the program’s memory has been corrupted, there are also many invalid characters that should be ignored, even if they look like part of a =mul= instruction. Sequences like =mul(4*=, =mul(6,9!=, =?(12,34)=, or =mul ( 2 , 4 )= do nothing.
For example, consider the following section of corrupted memory:
=xmul(2,4)%&mul[3,7]!@^do_not_mul(5,5)+mul(32,64]then(mul(11,8)mul(8,5))=
Only the four highlighted sections are real =mul= instructions. Adding up the result of each instruction produces =161= (=2*4 + 5*5 + 11*8 + 8*5=).
Scan the corrupted memory for uncorrupted mul instructions. What do you get if you add up all of the results of the multiplications?
My input file: 2024-12-03-1-aoc.txt
First, let’s load the data and take a look:
library(tidyverse)
connection <- file("2024-12-03-1-aoc.txt", open = "r")
lines <- readLines(connection)
print(lines[1])
mullist <- stringr::str_extract_all(lines, "mul\\(\\d*,\\d*\\)")
mullist <- unlist(mullist, recursive = FALSE)
head(mullist)
df <- mullist |>
data.frame() |>
magrittr::set_colnames(c("muls")) |>
tidyr::separate_wider_regex(
cols = "muls",
patterns = c(
"mul\\(",
mul1 = "\\d*",
",",
mul2 = "\\d*"
),
too_few = "align_start"
) |>
mutate(
mul1 = as.integer(mul1),
mul2 = as.integer(mul2),
result = mul1 * mul2
)
head(df)
| mul1 | mul2 | result |
|---|---|---|
| <int> | <int> | <int> |
| 948 | 148 | 140304 |
| 590 | 32 | 18880 |
| 611 | 372 | 227292 |
| 835 | 665 | 555275 |
| 724 | 851 | 616124 |
| 188 | 482 | 90616 |
sum(df$result)
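For comparison, the whole part-one pipeline collapses to a few lines of Python with re.findall, shown here on the example memory from the puzzle (a sketch; the two capturing groups replace the separate_wider_regex step):

```python
import re

memory = "xmul(2,4)%&mul[3,7]!@^do_not_mul(5,5)+mul(32,64]then(mul(11,8)mul(8,5))"

# findall with two capturing groups returns the (X, Y) digit pairs directly;
# the corrupted surroundings simply never match the pattern.
pairs = re.findall(r"mul\((\d{1,3}),(\d{1,3})\)", memory)
total = sum(int(a) * int(b) for a, b in pairs)
print(total)  # 161, as in the puzzle description
```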
There are two new instructions you’ll need to handle:
- The =do()= instruction enables future =mul= instructions.
- The =don't()= instruction disables future =mul= instructions.
Only the most recent =do()= or =don't()= instruction applies. At the beginning of the program, =mul= instructions are enabled.
For example:
=xmul(2,4)&mul[3,7]!^don't()_mul(5,5)+mul(32,64](mul(11,8)undo()?mul(8,5))=
This corrupted memory is similar to the example from before, but this time the =mul(5,5)= and =mul(11,8)= instructions are disabled because there is a =don't()= instruction before them. The other =mul= instructions function normally, including the one at the end that gets re-*enabled* by a =do()= instruction.
This time, the sum of the results is =48= (=2*4 + 8*5=).
Handle the new instructions; what do you get if you add up all of the results of just the enabled multiplications?
My idea: split each line into the enabled segments - the prefix before the first don't(), the do()…don't() stretches in the middle, and a trailing do()… segment - and extract the mul instructions only from those. For the regexes, this was helpful: https://stackoverflow.com/questions/36056219/regex-match-first-occurrence-before-keyword
lines[[4]]
# Exploratory: per line, collect the stretches where mul is enabled.
for(i in 1:6) {
  # dolistf_i: prefix before the first don't() (mul starts out enabled)
  varname <- paste("dolistf", i, sep="")
  assign(varname, stringr::str_extract(
    lines[[i]], "^(?:.*?)don't\\(\\)"))
  # dolistl_i: meant to be the last do()...don't() stretch; the stray ^
  # anchor keeps this pattern from ever matching mid-line, so it yields NA
  varname <- paste("dolistl", i, sep="")
  assign(varname, stringi::stri_extract_last_regex(
    lines[[i]],
    "(do\\(\\)).*?(^don't\\(\\))"))
}
## dolist7 <- stringr::str_extract_all(lines, "do\\(\\)(?:.*?)don't\\(\\)")
# dolist7: prefix before the first don't() of each line
dolist7 <- stringr::str_extract_all(lines,
                                    "^(?:.*?)don't\\(\\)")
# dolist8: do()...don't() stretches not directly preceded by don't()
dolist8 <- stringr::str_extract_all(lines,
                                    "(?<!(don't\\(\\)))(do\\(\\)(?:.*?)don't\\(\\))")
# dolist9: trailing do()... segment with no later don't()
dolist9 <- stringr::str_extract_all(lines,
                                    "do\\(\\)(?:.(?!(don't\\(\\))))+$")
dolist7 <- unlist(dolist7, recursive = FALSE)
dolist <- c(dolistf1, dolistf2, dolistf3, dolistf4, dolistf5, dolistf6,
            dolistl1, dolistl2, dolistl3, dolistl4, dolistl5, dolistl6,
            dolist7)
dolist8
# Extract the mul instructions from the enabled segments
mullist <- stringr::str_extract_all(dolist7, "mul\\(\\d*,\\d*\\)")
mullist <- unlist(mullist, recursive = FALSE)
length(mullist)
df <- mullist |>
data.frame() |>
magrittr::set_colnames(c("muls")) |>
tidyr::separate_wider_regex(
cols = "muls",
patterns = c(
"mul\\(",
mul1 = "\\d*",
",",
mul2 = "\\d*"
),
too_few = "align_start"
) |>
mutate(
mul1 = as.integer(mul1),
mul2 = as.integer(mul2),
result = mul1 * mul2
)
head(df)
| mul1 | mul2 | result |
|---|---|---|
| <int> | <int> | <int> |
| 948 | 148 | 140304 |
| 590 | 32 | 18880 |
| 611 | 372 | 227292 |
| 835 | 665 | 555275 |
| 416 | 366 | 152256 |
| 459 | 47 | 21573 |
sum(df$result)
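Instead of slicing the memory into enabled segments with several regexes, one can also scan token by token and keep an enabled flag - arguably the simpler mental model. A Python sketch on the example memory:

```python
import re

memory = "xmul(2,4)&mul[3,7]!^don't()_mul(5,5)+mul(32,64](mul(11,8)undo()?mul(8,5))"

enabled, total = True, 0
# One pass over all three token types, in input order; mul is enabled at start.
for m in re.finditer(r"mul\((\d{1,3}),(\d{1,3})\)|do\(\)|don't\(\)", memory):
    token = m.group(0)
    if token == "do()":
        enabled = True
    elif token == "don't()":
        enabled = False
    elif enabled:
        total += int(m.group(1)) * int(m.group(2))
print(total)  # 48, as in the puzzle description
```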
This year I am attempting the Advent of Code 2024 challenges in the statistics-focused programming language R. This is Day 3 - see https://adventofcode.com/2024/day/3
For the solutions, see the English version of this blog post.
- Charlotte Perkins Gilman: The Yellow Wallpaper
- A classic of feminist literature with a soft horror touch - a quick read (only 24 pages) and definitely worth it.
- Herta Müller: Der König verneigt sich und tötet (essay collection)
- Jede Sprache hat andere Augen
- Der König verneigt sich und tötet
- …
A basic thought in Samuel Shem’s Mount Misery: It is not self that heals, but connection.
Corona: connection (social closeness) kills.
See English version.
See German version.
In a recent episode of the ZEIT podcast “Alles gesagt”, Thea Dorn, Jochen Wegner, and Christoph Amend discuss the book Trost. Briefe an Max. They approach a core problem of modern humans: we aim either to minimize risks before an unwanted event, or to demand compensation and accountability afterwards. We have unlearned how to deal with the imponderables of life. Religiosity, with its consoling effect, is said to be the decisive way out.
Points I want to address:
- Thea Dorn flatly quotes Nietzsche as “God is dead, man is God, etc.” → go into Nietzsche’s “God is dead” as a call to humans in the “enlightened” world that basic things like human rights, or indeed consolation, need a new justification. Faith as a mental trick. “God is dead” as a call, charged with hope, to find a more sustainable, deeper foundation.
- She returns to religiosity. Socialized in East Germany, I cannot return. As a child I was not introduced to faith; it remains an inaccessible topic for me, one I cannot accept as honest and true consolation. I know immovably that life is a short, random moment, that there is nothing after death, that in the end life has no higher meaning. This is my starting point. Is this why I feel no hopelessness in these considerations? I do not have to dwell there; I can take a step further: in the end, what counts is what I do in my present life.
- That life exists is chemical chance; that human life exists is evolutionary chance; that I was born in a rich country is historical chance; that my parents could provide for me and raise me well is familial chance. Do we need an ethics of chance? If so many parts of our existence are chance, is that enough to provide consolation?
- When disaster strikes, it is a stroke of fate - but is it not also a statistical quantity? Medical school shows me: people fall ill all the time, while I often get off lightly. I was not involved in a fatal traffic accident at 18. I did not die of acute leukemia at 23. If I saw these statistical threats as real threats all the time, I would surely go mad, despair, become hopeless.
- Yet I do not. Why not? Surely in part because of a certain stoic attitude. But that is not enough; it also takes an active part - risk minimization ultimately provides this reassurance: “I have done everything I could”. Christian Drosten had a similar motto: we must be concerned and take measures - but not panic.
Another topic: Thea Dorn says natural scientists weigh things against each other: we calculate what causes how much suffering, then take what causes less. That, she says, is utilitarianism. She, however, says: we must weigh saving lives against “freedom”.
- with Friederike Schmitz: weighing in the legal sense IS utilitarianism!
General impression: a thoughtful book; mostly he tries to avoid single stories and gives an impression of different scenes in or between different African countries from the end of the 1950s until the 1990s. From time to time he tries to explain certain views or mentalities in African countries, which might come off as condescending from today's view and be of more interest to non-African readers than to African readers.
In this blog post I will share the more impressionistic quotes I liked, while some historical takeaways will be collected in Notes.
- from very first second: exhibit enormous primal joy and geniality
- extend hand in large, vigorous gesture
- loud laughter, many questions
- a state wherein the civil servant received remuneration beyond all measure
and reason (a low-level white bureaucrat suddenly gets a villa, servants, …)
- after independence this system gives fast rise to new elites
- French: la politique du ventre (the politics of the stomach)
- 10,000 kingdoms, federations, and stateless but independent tribal associations crammed into ~40 colonies! (without asking)
- ports - only leeches on the body of Africa, points of export for slaves, gold and ivory
- p52: Why Indians built the railway
- White worker from Europe → was the master, could not do physical labour
- African worker → “did not exist”; the concept of wages was missing, and the British later had a system of forced labour
- p82: Islands around Africa
- Were bases for sailors, merchants, and robbers (especially Europeans)
- For unstable African boats hard to reach → spot for concentration camps for slave trade
- p83: The philosophy that inspired the construction of Kolyma and Auschwitz, one of obsessive content and hatred, vileness and brutality, was formulated and set down centuries earlier by the captains of the Martha and the Progress, the Marie Ann and the Rainbow
- lions attacking humans → are old outcasts
- where are the elephant cemeteries → at the bottom of lakes
- p70: Poland vs Tanzania
- children asking Ryszard in Poland “And did you see many cannibals?”
- Mothers in Tanganyika to their children: “You had better be good, or else the mzungu (Swahili: the white man, the European) will eat you!”
- p110: So often I had felt irritated with people who arrived here, lived in “little Europe” or “little America” (e.g. in luxury hotels), and departed, bragging later that they had been to Africa, a place that in reality they had never seen.
- Lalibela: 11 great churches carved in stone and misery