TL;DR

I set up a local workflow that lets me turn a webpage into an epub on my Android phone and send it to my Kobo.

Introduction

Since Mozilla killed Pocket, I have been looking for an alternative that doesn’t depend on the decisions of any tech company, only on myself.

I used the Pocket feature quite a lot and, even though I appreciate Kobo’s effort to replace it with Instapaper, I didn’t want to depend on someone else for something as simple as reading an article later on my e-ink device.

I considered Wallabag and Readeck, but for both I would have to either depend on someone else’s server or self-host, and I didn’t want to deal with that complexity.

I wanted an approach where I was in control, so all the steps needed to be based on FOSS software that I could at least understand.

The basic idea

I figured that what I needed was a two-step approach, and that I could solve each step separately:

  1. Turn a webpage into an epub
  2. Send the epub to my Kobo

The explanation below is long, but following steps 1-a and 2-a in particular is fairly easy and doesn’t involve any modification or coding.

Step 1: Turn a webpage into an epub

In my long search for a way to do this I ended up finding two approaches: one available “off the shelf” and one that involved much more coding.

Step 1-a: EinkBro

I found out that there is a fantastic FOSS browser, EinkBro, that is designed for e-ink devices but works very well on any Android device. It is slick, fast, configurable and well designed. It implements Mozilla’s Readability library, which is great, and, above all, it can export webpages directly as epub files. You can configure the toolbar so that the “export to epub” icon is directly visible. The exported epub is nice and looks like the “Readability” version of the webpage (probably because it is…). So, when I want to save an article, I share it from my browser to EinkBro and, from there, export it to epub.

Step 1-b: Termux + readability-scrape + pandoc

For this one I went all-in down the rabbit hole of total control… or maybe I could have done worse. Anyway, here are the components:

  • Termux: a terminal emulator for Android that lets you do almost whatever you can do in a terminal on a full-blown Linux machine
  • readability-scrape: a command line tool that scrapes a URL and returns a simplified version of the page, using Mozilla’s Readability library (the one behind Firefox’s reader mode)
  • Pandoc: a command line tool that converts documents from one format to another; in our case, HTML to epub

I won’t go into the details of how to install each of them. If you need them, just ask.

I set up Termux so that, if I share a webpage to Termux via the Android share menu, it triggers the following script, ~/bin/termux-url-opener (see this webpage to understand how Termux handles shared URLs):

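termux-toast "termux received $1" # toast message to warn that the url was received

# $HOME instead of a quoted ~ (which the shell does not expand); "$1" is quoted so the URL stays a single argument
termux-chroot "$HOME/scripts/webpage_to_epub.sh" "$1"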

Note: for some reason pandoc works as intended only when executed in a chroot, which is why the following script is launched through termux-chroot in the snippet above.
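
In case an app shares some extra text together with the link, the opener could pull out the URL before handing it over. This is only a minimal sketch: the helper name extract_url is hypothetical (it is not part of Termux), and what each app actually sends through the share menu is an assumption that would need checking.

```shell
# Hypothetical helper (not part of Termux): print the first http(s) URL
# found in the text passed as first argument, or nothing if there is none.
extract_url() {
  printf '%s\n' "$1" | grep -oE 'https?://[^[:space:]]+' | head -n 1
}

# Possible use inside ~/bin/termux-url-opener:
#   url=$(extract_url "$1")
#   [ -n "$url" ] && termux-chroot "$HOME/scripts/webpage_to_epub.sh" "$url"
```

If the shared text contains no link at all, the function prints nothing, so the conversion script is simply not started.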

webpage_to_epub.sh

#!/bin/bash

# final destination of the epub file ($HOME instead of ~, which is not expanded inside quotes)
FINAL_DIR="$HOME/storage/shared/Documents/epub_articles/"

# Check if the URL argument is provided
if [ "$#" -ne 1 ]; then
  echo "Usage: $0 <URL>"
  exit 1
fi

URL="$1"
JSON_OUTPUT=$(readability-scrape --json "$URL")

# Check if the readability command was successful
if [ $? -ne 0 ]; then
  echo "Error: Failed to scrape URL."
  exit 1
else
  echo "readability-scrape: SUCCESS!!"
fi

# Extract title, content, author and length using jq
# ("// ..." supplies a fallback when a field is null or missing)
TITLE=$(echo "$JSON_OUTPUT" | jq -r '.title // "untitled"')
CONTENT=$(echo "$JSON_OUTPUT" | jq -r '.content // ""')
AUTHOR=$(echo "$JSON_OUTPUT" | jq -r '.byline // ""')
CONTENT_LENGTH=$(echo "$JSON_OUTPUT" | jq -r '.length // 0')  # Length in characters

# Calculate reading times based on character length
# Convert characters to words (approximate)
WORDS=$(($CONTENT_LENGTH / 5))

# Calculate reading times based on two speeds (200 and 300 words per minute)
READING_TIME_LOW=$(($WORDS / 300))  # For 300 wpm
READING_TIME_HIGH=$(($WORDS / 200))  # For 200 wpm

# Format the output for reading time
if [ "$READING_TIME_LOW" -eq "$READING_TIME_HIGH" ]; then
  READING_TIME="${READING_TIME_LOW} minutes"
else
  READING_TIME="${READING_TIME_LOW} - ${READING_TIME_HIGH} minutes"
fi

# Output the estimated reading time
echo "Estimated reading time: $READING_TIME"

# Format the current date in ISO format (YYYY-MM-DD)
CURRENT_DATE=$(date +"%Y-%m-%d")

# Remove accent characters and sanitize the title to create a valid filename
SANITIZED_TITLE=$(echo "$TITLE" | iconv -f UTF-8 -t ASCII//TRANSLIT | tr -cd '[:alnum:]_ ')  # Convert to ASCII and keep alphanumeric characters
SANITIZED_TITLE="${SANITIZED_TITLE// /_}"  # Replace spaces with underscores

# Create the final filename with date prefix
EPUB_FILE="${CURRENT_DATE}_${SANITIZED_TITLE}.epub"

# Create a temporary HTML file
HTML_FILE=$(mktemp /tmp/readability_output.XXXXXX.html)

# Write the complete HTML output
cat <<EOT > "$HTML_FILE"
<html>
<head>
  <title>$TITLE</title>
</head>
<body>
  <h1>$TITLE</h1>
  <div>
    $READING_TIME | <a href="$URL">original link</a>
  </div>
  <hr />
  $CONTENT
</body>
</html>
EOT

# Create a temporary title file for metadata
TITLE_FILE=$(mktemp /tmp/title.XXXXXXXXX.txt)

# Write the Pandoc YAML metadata block
# (currently unused: the pandoc call that reads it is commented out below)
cat <<EOT > "$TITLE_FILE"
---
title: "$TITLE"
author: "$AUTHOR"
---
EOT

# Convert the HTML file to EPUB including the metadata
#pandoc "$TITLE_FILE" "$HTML_FILE" -o "$EPUB_FILE"
pandoc "$HTML_FILE" -o "$EPUB_FILE"

# Check if pandoc command was successful
if [ $? -eq 0 ]; then
  echo "EPUB generated: $EPUB_FILE"
  mv "$EPUB_FILE" ~/storage/shared/Documents/epub_articles
else
  echo "Error: Failed to generate EPUB."
fi

# Clean up temporary files
rm -f "$HTML_FILE" "$TITLE_FILE"

read -r -p "Press [Enter] key to continue..."
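
The reading-time arithmetic in the script can be tried on its own. The sketch below wraps the same assumptions (5 characters per word, 200 to 300 words per minute, integer division) in a function; the name reading_time is hypothetical and not part of the script.

```shell
# Hypothetical helper: estimate the reading time from a character count,
# with the same arithmetic as webpage_to_epub.sh.
reading_time() {
  chars=$1
  words=$((chars / 5))     # rough characters-to-words conversion
  low=$((words / 300))     # minutes at 300 words per minute
  high=$((words / 200))    # minutes at 200 words per minute
  if [ "$low" -eq "$high" ]; then
    echo "${low} minutes"
  else
    echo "${low} - ${high} minutes"
  fi
}

reading_time 6000   # 1200 words -> prints "4 - 6 minutes"
reading_time 500    # 100 words  -> prints "0 minutes"
```

Because shell arithmetic rounds down, any article shorter than 1000 characters is reported as 0 minutes.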

I spent some time crafting the script to produce an output that I like but, honestly, it’s not better than the one produced by EinkBro in step 1-a. The advantage of the Termux script is that it is a one-click process: I share the link to Termux, and the script generates the epub and saves it to a folder that is set up in the next step to upload automatically.
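
To see what file name the script ends up with for a given title, the sanitizing step can be run in isolation. This is a sketch under assumptions: sanitize_title and to_ascii are hypothetical names, and the fallback to cat when iconv is missing is an addition for illustration, not something the script does.

```shell
# Hypothetical helper: transliterate stdin to ASCII when iconv is available,
# otherwise pass the text through unchanged.
to_ascii() {
  if command -v iconv >/dev/null 2>&1; then
    iconv -f UTF-8 -t ASCII//TRANSLIT
  else
    cat
  fi
}

# Hypothetical helper mirroring the sanitizing step of webpage_to_epub.sh:
# keep letters, digits, underscores and spaces, then turn spaces into underscores.
sanitize_title() {
  printf '%s' "$1" | to_ascii | tr -cd '[:alnum:]_ ' | tr ' ' '_'
}

# Same naming scheme as the script: date prefix plus sanitized title.
echo "$(date +%Y-%m-%d)_$(sanitize_title 'Hello, World: a test!').epub"
```

How accented characters are transliterated depends on the iconv implementation, so the example sticks to a plain ASCII title.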

Step 2: Send the epub to my Kobo

For step 2, too, I found two alternatives: one more “manual” and direct, the other more automatic.

Step 2-a: Share via HTTP

For this I use a simple app, Share via HTTP: I share the epub file to this app through the Android share menu. The app starts a mini web server at my phone’s local IP address (on the Wi-Fi network, which can also be the phone’s own Android hotspot). I then point the Kobo browser to that local address. The browser asks whether I want to download the file and, once downloaded, the file is added to the Kobo library.

Using NickelMenu, I added a shortcut to the Kobo menu that starts the browser, to make things faster.

This is the simplest solution: everything works locally, with no third party involved.

Step 2-b: Nextcloud sync

As an alternative I set up a Nextcloud sync.

  • On Android I set the folder where I save epubs as “automatic upload”, so epub files are uploaded to a folder on my Nextcloud as soon as I save them
  • On the Kobo I set up Nextcloud synchronization. There is more than one way to do it; I used this one. Whenever I connect my Kobo to Wi-Fi, the new epubs are downloaded to it and added to the library. The only downside is that, to delete an article, I have to delete it from the Nextcloud folder; if I delete it from my Kobo, it gets re-added as soon as I connect to Wi-Fi

Conclusions

Maybe this looks too complex, but I learned a lot and had fun in the process. I find that pandoc is probably overkill for what is needed here: in the end an epub is a bundle of HTML and images, so there is probably a better and slicker way to package them. If you have any suggestions to improve the workflow, they are welcome :-)

What do you use these days?

  • lgsp@feddit.it (OP)
    19 days ago

    It looks nice, but it completely lacks documentation. It seems that everything happens via JavaScript, so it works locally, right?