Today was hard, but a good challenge. I haven’t written a recursive function since college CS – is that typical for data science / analysis work? I don’t see much about recursion on #rstats Twitter. Recursion feels like a separate way of thinking, and I had forgotten how.
Part 1
The first part was satisfying. Data cleaning was quick and I thought joining the data.frame to itself to populate parent and child for each entry was a nice touch.
test <- read.delim("07_1_test_dat.txt", header = FALSE, stringsAsFactors = FALSE)
library(pacman)
p_load(tidyr, splitstackshape, readr, dplyr, stringr)
# Tidy the data
tidy_puzzle_input <- function(x){
result <- x %>%
separate(V1, into = c("name", "child"), sep = "->") %>%
cSplit(., "child", ",", "long", type.convert = FALSE) %>%
mutate(weight = parse_number(name),
name = str_extract(name, "[A-z]+"))
# Join to itself to attach parent name
left_join(
result,
result %>%
select(-weight) %>%
rename(name = child, parent = name)
)
}
My tidy data looked like:
name child weight parent
1 jlbcwrl <NA> 93 tzrppo
2 fzqsahw lybovx 256 peuppj
3 fzqsahw pdmhva 256 peuppj
4 rxivjo mewof 206 ikdsvc
5 rxivjo hrncqs 206 ikdsvc
Then the solution was easy:
# the bottom node will be the only one that doesn't have a parent
test %>%
tidy_puzzle_input %>%
filter(is.na(parent))
read.delim("07_1_dat.txt", header = FALSE, stringsAsFactors = FALSE) %>%
tidy_puzzle_input %>%
filter(is.na(parent))
Part 2
This was tough going. I got a function working for the sample input, which worked by stepping backward from the most distant children and adding their weight to their parents, then removing them and calling itself again.
But while the sample input had symmetric towers, in the bigger test data a tower could be balanced if it had two children of `4` and `2 -> 1 + 1`. In that scenario, you can’t peel off the 4 in the same step that you peel off the 1s. (For my own satisfaction, that false solution appears far below).
I’m proud of my eventual solution. And besides finally getting my brain to think recursively, I learned a few things: creating a named vector with `structure()` and using `purrr::map_dbl`.
get_faulty_set <- function(dat){
get_node_children <- function(node_name){
dat$child[dat$name == node_name]
}
# recursively get the cumulative weight of a node
get_node_weight <- function(node_name){
if(is.na(get_node_children(node_name))[1]){
dat$weight[dat$name == node_name][1]
} else{
dat$weight[dat$name == node_name][1] +
sum(sapply(get_node_children(node_name), get_node_weight))
}
}
# Grab parents whose children have multiple weights - the problem cascades
faulty_parents <- dat %>%
rowwise %>%
mutate(total_weight = get_node_weight(name)) %>%
group_by(parent) %>%
summarise(num_wts = n_distinct(total_weight)) %>%
filter(num_wts > 1) %>%
pull(parent)
# find the lowest parent where the problem occurs
lowest_bad_parent <- dat %>%
filter(name %in% faulty_parents) %>%
filter(! name %in% .$parent) %>%
slice(1) %>%
pull(name)
# Get the weights & names of that parent's children
bad_set <- structure(
purrr::map_dbl(get_node_children(lowest_bad_parent), get_node_weight),
names = get_node_children(lowest_bad_parent)
)
bad_set
# From here it's common sense to spot the odd node out, see how much it needs to change;
# Then fetch its weight using get_node_weight and compute the sum
}
I finished with a cop-out: once my function returned the weights of the subtower nodes where the problem existed and their names, it wasn’t worth programming the last bit to do the weight subtraction and get my answer. With this output:
dkvzre awufxne osbbt ycbgx wdjzjlk
2255 2255 2255 2260 2255
I just calculated it myself using `get_node_weight(“ycbqx”)` and subtracting 5.
Appendix: Solution with Symmetric Towers
Because I feel salty that this didn’t work.
find_unbalanced_tower <- function(x){
lowest_children <- x %>%
filter(is.na(child)) %>%
distinct(.keep_all = TRUE)
res <- lowest_children %>%
group_by(parent) %>%
summarise(uniques = n_distinct(weight),
total_wt = sum(weight))
# check for completion and return result if done
if(!all(res$uniques == 1)){
problem_parent <- res$parent[res$uniques == 2]
print(paste("Problem parent is: ", problem_parent))
lowest_children %>%
filter(parent == problem_parent) %>%
group_by(name) %>%
slice(1)
} else {
# Then add their weight to parent weight, remove them, update parents to have no children
x <- x %>%
mutate(child = ifelse(child %in% lowest_children$name, NA, child)) %>%
filter(!name %in% lowest_children$name) %>%
rowwise() %>%
mutate(weight = weight + sum(res$total_wt[res$parent == name])) %>%
ungroup() %>%
distinct(.keep_all = TRUE)
find_unbalanced_tower(x)
}
}
my_input %>%
tidy_puzzle_input %>%
find_unbalanced_tower()