Frans Pop death: a pre-planned Debian-Day suicide that Debian kept hidden for 12 years

Frans Pop, Debian Day, Suicide


August 10, 2022

Russell Coker

TSIG Error From SSSD

A common error when using the sssd daemon to authenticate via Active Directory on Linux seems to be:

sssd[$PID]: ; TSIG error with server: tsig verify failure

This is from sssd launching the command “nsupdate -g” to do dynamic DNS updates. It is possible to specify the DNS server in /etc/sssd/sssd.conf but that will only be used AFTER the default servers have been attempted, so it seems impossible to stop this error from happening. It doesn’t appear to do any harm as the correct server is discovered and used eventually. The commands piped to the nsupdate command will be something like:

server $SERVERIP
realm $DOMAIN
update delete $HOSTNAME.$DOMAIN. in A
update add $HOSTNAME.$DOMAIN. 3600 in A $HOSTIP
send
update delete $HOSTNAME.$DOMAIN. in AAAA
send
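
For reference (an illustrative sketch, not part of the original report; the domain name and values here are assumptions), the dynamic DNS behaviour is controlled in the domain section of /etc/sssd/sssd.conf, along these lines:

[domain/example.com]
id_provider = ad
# dynamic DNS updates are performed via "nsupdate -g"
dyndns_update = true
# only consulted as a fallback after the auto-detected servers, as noted above
dyndns_server = dc1.example.com
# refresh the records once a day (seconds)
dyndns_refresh_interval = 86400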

10 August, 2022 03:03AM by etbe

August 08, 2022


Bits from Debian

Debian Day 2022 - call for celebration

Every year on August 16th, the anniversary of the Debian Project takes place. And several communities around the world celebrate this date by organizing local meetings in an event called "Debian Day".

So, how about celebrating the 29th anniversary of the Debian Project in 2022 in your city?

We invite you and your local community to organize Debian Day by hosting an event with talks, workshops, bug squashing party, OpenPGP keysigning, etc. Or simply holding a meeting between people who like Debian in a bar/pizzeria/cafeteria/restaurant to celebrate. In other words, any type of meeting is valid!

But remember that the COVID-19 pandemic is not over yet, so take all necessary measures to protect attendees.

As the 16th of August falls on a Tuesday, if you think it's better to organize it during the weekend, no problem. The important thing is to celebrate the Debian Project.

Remember to add your city to the Debian Day wiki page

There is a list of Debian Local Groups around the world. If your city is listed, talk to them to organize Debian Day together.

Let's use hashtags #DebianDay #DebianDay2022 on social media.

08 August, 2022 03:00PM by The Debian Publicity Team


Adnan Hodzic

atuf.app: Amsterdam Toilet & Urinal Finder

Amsterdam is a great city to enjoy a beer, or two. However, after you had your beers and you’re strolling down the streets, you might...

The post atuf.app: Amsterdam Toilet & Urinal Finder appeared first on FoolControl: Phear the penguin.

08 August, 2022 09:48AM by Adnan Hodzic

Ian Jackson

dkim-rotate - rotation and revocation of DKIM signing keys

Background

Internet email is becoming more reliant on DKIM, a scheme for having mail servers cryptographically sign emails. The Big Email providers have started silently spambinning messages that lack either DKIM signatures, or SPF. DKIM is arguably less broken than SPF, so I wanted to deploy it.

But it has a problem: if done in a naive way, it makes all your emails non-repudiable, forever. This is not really a desirable property - at least, not desirable for you, although it can be nice for someone who (for example) gets hold of leaked messages obtained by hacking mailboxes.

This problem was described at some length in Matthew Green’s article Ok Google: please publish your DKIM secret keys. Following links from that article does get you to a short script to achieve key rotation but it had a number of problems, and wasn’t useable in my context.

dkim-rotate

So I have written my own software for rotating and revoking DKIM keys: dkim-rotate.

I think it is a good solution to this problem, and it ought to be deployable in many contexts (and readily adaptable to those it doesn’t already support).

Here’s the feature list taken from the README:

  • Leaked emails become unattestable (plausibly deniable) within a few days — soon after the configured maximum email propagation time.

  • Mail domain DNS configuration can be static, and separated from operational DKIM key rotation. Domain owner delegates DKIM configuration to mailserver administrator, so that dkim-rotate does not need to edit your mail domain’s zone.

  • When a single mail server handles multiple mail domains, only a single dkim-rotate instance is needed.

  • Supports situations where multiple mail servers may originate mails for a single mail domain.

  • DNS zonefile remains small; old keys are published via a webserver, rather than DNS.

  • Supports runtime (post-deployment) changes to tuning parameters and configuration settings. Careful handling of errors and out-of-course situations.

  • Convenient outputs: a standard DNS zonefile; easily parseable settings for the MTA; and, a directory of old keys directly publishable by a webserver.

Complications

It seems like it should be a simple problem. Keep N keys, and every day (or whatever), generate and start using a new key, and deliberately leak the oldest private key.

But, things are more complicated than that. Considerably more complicated, as it turns out.

I didn’t want the DKIM key rotation software to have to edit the actual DNS zones for each relevant mail domain. That would tightly entangle the mail server administration with the DNS administration, and there are many contexts (including many of mine) where these roles are separated.

The solution is to use DNS aliases (CNAME). But, now we need a fixed, relatively small, set of CNAME records for each mail domain. That means a fixed, relatively small set of key identifiers (“selectors” in DKIM terminology), which must be used in rotation.
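
Purely as an illustration (the selector names and the target zone are hypothetical, not dkim-rotate's actual naming scheme), the mail domain's zone then carries a small fixed set of records along these lines:

; in the mail domain's zone, managed by the domain owner
sel1._domainkey.example.org.  IN CNAME  sel1.example-org._dkim.mail.example.net.
sel2._domainkey.example.org.  IN CNAME  sel2.example-org._dkim.mail.example.net.
sel3._domainkey.example.org.  IN CNAME  sel3.example-org._dkim.mail.example.net.
; the TXT records behind the CNAMEs live in a zone the mailserver
; administrator controls, and those are what get rotated.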

We don’t want the private keys to be published via the DNS because that makes an ever-growing DNS zone, which isn’t great for performance; and, because we want to place barriers in the way of processes which might enumerate the set of keys we use (and the set of keys we have leaked) and keep records of what status each key had when. So we need a separate publication channel - for which a webserver was the obvious answer.

We want the private keys to be readily noticeable and findable by someone who is verifying an alleged leaked email dump, but to be hard to enumerate. (One part of the strategy for this is to leave a note about it, with the prospective private key url, in the email headers.)

The key rotation operations are more complicated than first appears, too. The short summary, above, neglects to consider the fact that DNS updates have a nonzero propagation time: if you change the DNS, not everyone on the Internet will experience the change immediately. So as well as a timeout for how long it might take an email to be delivered (ie, how long the DKIM signature remains valid), there is also a timeout for how long to wait after updating the DNS, before relying on everyone having got the memo. (This same timeout applies both before starting to sign emails with a new key, and before deliberately compromising a key which has been withdrawn and deadvertised.)
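
To make that concrete with purely illustrative numbers (not dkim-rotate's defaults): with a one-day DNS propagation allowance and a four-day email propagation allowance, a newly published selector only starts signing a day after its DNS record appears; and a withdrawn, deadvertised selector only has its private key deliberately leaked after a further day for the deadvertisement to propagate, plus four more days for any already-signed mail to finish being delivered and verified.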

Updating the DNS, and the MTA configuration, are fallible operations. So we need to cope with out-of-course situations, where a previous DNS or MTA update failed. In that case, we need to retry the failed update, and not proceed with key rotation. We mustn’t start the timer for the key rotation until the update has been implemented.

The rotation script will usually be run by cron, but it might be run by hand, and when it is run by hand it ought not to “jump the gun” and do anything “too early” (ie, before the relevant timeout has expired). cron jobs don’t always run, and don’t always run at precisely the right time. (And there’s daylight saving time, to consider, too.)

So overall, it’s not sufficient to drive the system via cron and have it proceed by one unit of rotation on each run.

And, hardest of all, I wanted to support post-deployment configuration changes, while continuing to keep the whole the system operational. Otherwise, you have to bake in all the timing parameters right at the beginning and can’t change anything ever. So for example, I wanted to be able to change the email and DNS propagation delays, and even the number of selectors to use, without adversely affecting the delivery of already-sent emails, and without having to shut anything down.

I think I have solved these problems.

The resulting system is one which keeps track of the timing constraints, and the next event which might occur, on a per-key basis. It calculates on each run, which key(s) can be advanced to the next stage of their lifecycle, and performs the appropriate operations. The regular key update schedule is then an emergent property of the config parameters and cron job schedule. (I provide some example config.)

Exim

Integrating dkim-rotate itself with Exim was fairly easy. The lsearch lookup function can be used to fish information out of a suitable data file maintained by dkim-rotate.

But a final awkwardness was getting Exim to make the right DKIM signatures, at the right time.

When making a DKIM signature, one must choose a signing authority domain name: who should we claim to be? (This is the “SDID” in DKIM terms.) A mailserver that handles many different mail domains will be able to make good signatures on behalf of many of them. It seems to me that this should be the mail domain in the From: header of the email. (The RFC doesn’t seem to be clear on what is expected.) Exim doesn’t seem to have anything builtin to do that.

And, you only want to DKIM-sign emails that are originated locally or from trustworthy sources. You don’t want to DKIM-sign messages that you received from the global Internet, and are sending out again (eg because of an email alias or mailing list). In theory if you verify DKIM on all incoming emails, you could avoid being fooled into signing bad emails, but rejecting all non-DKIM-verified email would be a very strong policy decision. Again, Exim doesn’t seem to have ready-made machinery for this.

The resulting Exim configuration parameters run to 22 lines, and because they’re parameters to an existing config item (the smtp transport) they can’t even easily be deployed as a drop-in file via Debian’s “split config” Exim configuration scheme.
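
Purely to give the flavour (a hypothetical three-option sketch, not the actual 22 lines; the file paths and data layout are invented rather than dkim-rotate's real output format), the idea on the smtp transport is roughly:

# sign as the domain of the From: header, using the selector and key
# currently advertised for it (paths here are hypothetical)
dkim_domain      = ${lc:${domain:$h_from:}}
dkim_selector    = ${lookup{$dkim_domain}lsearch{/var/lib/dkim-rotate/selectors}}
dkim_private_key = /var/lib/dkim-rotate/keys/$dkim_domain.$dkim_selector.pem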

(I don’t know if the file written for Exim’s use by dkim-rotate would be suitable for other MTAs, but this part of dkim-rotate could easily be extended.)

Conclusion

I have today released dkim-rotate 0.4, which is the first public release for general use.

I have it deployed and working, but it’s new so there may well be bugs to work out.

If you would like to try it out, you can get it via git from Debian Salsa. (Debian folks can also find it freshly in Debian unstable.)




08 August, 2022 12:20AM

August 07, 2022

Debian Suicide FYI

Lucas Nussbaum & Debian attempted exploit of OVH Hosting insider

Lucas Nussbaum, Debian, Université de Lorraine

When Debian cabalists wanted to steal the domain debian-multimedia.org in 2014, they didn't go to a lawyer or the World Intellectual Property Organization.

The Debian Project Leader (DPL), Lucas Nussbaum, who is a professor at Université de Lorraine, France, relied on another Debian Developer to tap the shoulder of an insider at OVH, which is also a French company, to see if the domain registration could be hijacked covertly.

According to the email below, OVH managers didn't want to get involved in Debian dirty politics.

This incident demonstrates the way that many Debian cabal members have a second loyalty. Some of them siphon off time from their job, some of them siphon off information and they plot against real developers on the debian-private (leaked) gossip network.

When we talk about an open source "community" or a Debian "community", what we are really talking about is a revolving door that provides a risk of legal consequences, ethical traps and inadequate barriers between competing enterprises.

Subject: status of DebIan-multimedia.org
Date: Fri, 3 May 2013 16:18:41 +0200
From: Lucas Nussbaum <leader@DebIan.org>
To: DebIan-private@lists.DebIan.org

[ Why is this on -private@?" at the end of this mail ]

Hi,

There has been questions/rumors about the state of
DebIan-multimedia.org, and the ``interesting" content provided on
http://DebIan-multimedia.org/. I'll try to summarize what I know.

[redacted / trimmed]

2013-04:
Private discussions about trying to register the domain ourselves
(including a contact with OVH, as a DD knows someone in OVH management,
but OVH declined transferring the domain as it would conflict with the
ICANN procedures.
OVH Groupe SAS, Debian

07 August, 2022 07:00PM


Dirk Eddelbuettel

RApiSerialize 0.1.1 on CRAN: Updates

A new release 0.1.1 of RApiSerialize is now on CRAN. While this is the first release in seven years (!!), it brings mostly minor internal updates along with the option of using serialization format 3.

The package is used by both my RcppRedis as well as by Travers’ excellent qs package. Neither one of us has a need to switch to format 3 yet so format 2 remains the default. But along with other standard updates to package internals, it was straightforward to offer the newer format so that is what we did.

Changes in version 0.1.1 (2022-08-07)

  • Updated CI use to r-ci

  • Expanded and updated both DESCRIPTION and README.md

  • Updated package internals to register compiled functions

  • Add support for serialization format 3, default remains 2

  • Minor synchronization with upstream

Courtesy of my CRANberries, there is also a diffstat to the previous version. More details are at the RApiSerialize page; code, issue tickets etc at the GitHub repository.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

07 August, 2022 04:38PM

August 06, 2022

RcppCCTZ 0.2.11 on CRAN: Updates

A new release 0.2.11 of RcppCCTZ is now on CRAN.

RcppCCTZ uses Rcpp to bring CCTZ to R. CCTZ is a C++ library for translating between absolute and civil times using the rules of a time zone. In fact, it is two libraries. One for dealing with civil time: human-readable dates and times, and one for converting between absolute and civil times via time zones. And while CCTZ is made by Google(rs), it is not an official Google product. The RcppCCTZ page has a few usage examples and details. This package was the first CRAN package to use CCTZ; by now four other packages include its sources too. Not ideal, but beyond our control.

This version updates the include headers used in the interface API header thanks to a PR by Jing Lu, synchronizes with upstream changes, and switches the r-ci CI setup to r2u.

Changes in version 0.2.11 (2022-08-06)

  • More specific includes in RcppCCTZ_API.h (Jing Lu in #42 closing #41).

  • Synchronized with upstream CCTZ (Dirk in #43).

  • Switched r-ci to r2u use.

Courtesy of my CRANberries, there is also a diffstat to the previous version. More details are at the RcppCCTZ page; code, issue tickets etc at the GitHub repository.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

06 August, 2022 06:52PM

August 05, 2022

Thorsten Alteholz

My Debian Activities in July 2022

FTP master

This month I accepted 420 and rejected 44 packages. The overall number of packages that got accepted was 422.

I am sad to write the following lines, but unfortunately there are people who would rather take advantage of others than do proper maintenance of their packages.

So, in order to find time slots for as many packages in NEW as possible, I no longer write a debian/copyright for others. I know it is a boring task to collect the copyright information, but our policy still requires this. Of course nobody is perfect and certainly one or the other license or copyright holder can be overlooked. Luckily most of the contributors maintain their debian/copyright very thoroughly with a terrific result.

On the other hand some contributors upload only some crap and demand that I list exactly what is missing. I am no longer willing to do this. I am going to stop processing after I find a few missing things and reject the package. When I see repeated uploads that only fix the things I pointed out, I will process such a package only after all others from NEW are done.

Debian LTS

This was my ninety-seventh month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian.

This month my overall workload was 35.75h. Unfortunately Stretch LTS has moved to Stretch ELTS and Buster LTS was not yet opened in July, so I think this is the first month I did not work all of my assigned hours.

Besides things on security-master, I only worked 20h on moving the LTS documentation to its new destination. At the moment the documentation is spread over several locations; as searching across all those locations is not possible, it shall be collected in one place.

Debian ELTS

This month was the forty-eighth ELTS month.

During my allocated time I uploaded:

  • [ELA-643-1] for ncurses (5.9+20140913-1+deb8u4, 6.0+20161126-1+deb9u3)
  • [ELA-655-1] for libhttp-daemon-perl (6.01-1+deb8u1, 6.01-1+deb9u1)
  • [6.14-1.1] upload to unstable
  • [#1016391] bullseye-pu: libhttp-daemon-perl/6.12-1+deb11u1

I also started to work on mod-wsgi and my patch was already approved by the maintainer. Now I am waiting for the security team to decide whether it will be uploaded as DSA or via PU.

Last but not least I did some days of frontdesk duties.

Other stuff

This month I uploaded new upstream versions or improved packaging of:

05 August, 2022 03:43PM by alteholz


Dirk Eddelbuettel

RcppXts 0.0.5 on CRAN: Routine Refreshment

A full eight and half years (!!) since its 0.0.4 release, version 0.0.5 of RcppXts is now on CRAN. The RcppXts package demonstrates how to access the export C API of xts which we contributed a looong time ago.

This release contains an accumulated small set of updates made as the CRAN Policies evolved. We now register and use the shared library routines (updates in both src/init.c and NAMESPACE), turned on continuous integration, switched it from the now-disgraced service to another, adopted our portable r-ci along with r2u, added badges to the README.md, updated to https URLs, and made sure the methods package (from base R) was actually imported (something Rcpp has a need for at startup). That latter part recently triggered an email from the CRAN maintainers, which prompted this release.

The NEWS entries follow.

Changes in version 0.0.5 (2022-08-05)

  • Depends on xts 0.9-6 now on CRAN

  • Exports (and documents) a number of additional functions

  • Switch CI use to r-ci and r2u

  • README.md, DESCRIPTION and NEWS.Rd were updated and expanded

  • NAMESPACE import of the shared library uses registration

Courtesy of my CRANberries, there is also a diffstat report for this release. A bit more information about the package is available here as well as at the GitHub repo.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

05 August, 2022 01:07PM

August 04, 2022

Jamie McClelland

Fine tuning Thunderbird's end-to-end encryption

I love that Thunderbird really tackled OpenPGP head on and incorporated it directly into the client. I know it’s been a bit rough for some users, but I think it’s a good long term investment.

And to demonstrate I’ll now complain about a minor issue :).

I replied to an encrypted message but couldn’t send the response using encryption. I got an error message indicating that “End-to-end encryption requires resolving certificate issues for” … and it listed the recipient email address.

[Screenshot of the error message: “End-to-end encryption requires resolving certificate issues for [blacked out email address]”]

I spent an enormous amount of time examining the recipient’s OpenPGP key. I made sure it was not expired. I made sure it was actually in my Thunderbird key store not just in my OpenPGP keychain. I made sure I had indicated that I trust it enough to use. I re-downloaded it.

I eventually gave up and didn’t send the email. Then I responded to another encrypted email and it worked. What!?!?

I spent more time comparing the recipients before I realized the problem was the sending address, not the recipient address.

I have an OpenPGP key that lists several identities. I have a Thunderbird Account that uses the Identities feature to add several from addresses. And, it turns out that in Thunderbird, you need to indicate which OpenPGP key to use for your main account… but also for each identity. When you drill down to Manage Identities for your account, you are able to indicate which OpenPGP key you want to use for each identity. Once I indicated that each identity should use my OpenPGP key, the issue was resolved.

And here’s my Thunderbird bug asking for an error message pointing to the sender address, not the recipient address.

04 August, 2022 10:27PM

Reproducible Builds

Reproducible Builds in July 2022

Welcome to the July 2022 report from the Reproducible Builds project!

In our reports we attempt to outline the most relevant things that have been going on in the past month. As a brief introduction, the reproducible builds effort is concerned with ensuring no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.


Reproducible Builds summit 2022

Despite several delays, we are pleased to announce that registration is open for our in-person summit this year:

November 1st → November 3rd

The event will happen in Venice (Italy). We intend to pick a venue reachable via the train station and an international airport. However, the precise venue will depend on the number of attendees.

Please see the announcement email for information about how to register.


Is reproducibility practical?

Ludovic Courtès published an informative blog post this month asking the important question: Is reproducibility practical?:

Our attention was recently caught by a nice slide deck on the methods and tools for reproducible research in the R programming language. Among those, the talk mentions Guix, stating that it is “for professional, sensitive applications that require ultimate reproducibility”, which is “probably a bit overkill for Reproducible Research”. While we were flattered to see Guix suggested as good tool for reproducibility, the very notion that there’s a kind of “reproducibility” that is “ultimate” and, essentially, impractical, is something that left us wondering: What kind of reproducibility do scientists need, if not the “ultimate” kind? Is “reproducibility” practical at all, or is it more of a horizon?

The post goes on to outline the concept of reproducibility, situating examples within the context of the GNU Guix operating system.


diffoscope

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 218, 219 and 220 to Debian, as well as made the following changes:

  • New features:

  • Bug fixes:

    • Fix a regression introduced in version 207 where diffoscope would crash if one directory contained a directory that wasn’t in the other. Thanks to Alderico Gallo for the testcase. []
    • Don’t traceback if we encounter an invalid Unicode character in Haskell versioning headers. []
  • Output improvements:

  • Codebase improvements:

    • Space out a file a little. []
    • Update various copyright years. []


Mailing list

On our mailing list this month:

  • Roland Clobus posted his Eleventh status update about reproducible [Debian] live-build ISO images, noting — amongst many other things! — that “all major desktops build reproducibly with bullseye, bookworm and sid.”

  • Santiago Torres-Arias announced a Call for Papers (CfP) for a new SCORED conference, an “academic workshop around software supply chain security”. As Santiago highlights, this new conference “invites reviewers from industry, open source, governement and academia to review the papers [and] I think that this is super important to tackle the supply chain security task”.


Upstream patches

The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. This month, we submitted the following patches:


Reprotest

reprotest is the Reproducible Builds project’s end-user tool to build the same source code twice in widely and deliberately different environments, and then check whether the binaries produced by the builds have any differences. This month, the following changes were made:

  • Holger Levsen:

    • Uploaded version 0.7.21 to Debian unstable as well as mark 0.7.22 development in the repository [].
    • Make diffoscope dependency unversioned as the required version is met even in Debian buster. []
    • Revert an accidentally committed hunk. []
  • Mattia Rizzolo:

    • Apply a patch from Nick Rosbrook to not force the tests to run only against Python 3.9. []
    • Run the tests through pybuild in order to run them against all supported Python 3.x versions. []
    • Fix a deprecation warning in the setup.cfg file. []
    • Close a new Debian bug. []


Reproducible builds website

A number of changes were made to the Reproducible Builds website and documentation this month, including:

  • Arnout Engelen:

  • Chris Lamb:

    • Correct some grammar. []
  • Holger Levsen:

    • Add talk from FOSDEM 2015 presented by Holger and Lunar. []
    • Show date of presentations if we have them. [][]
    • Add my presentation from DebConf22 [] and from Debian Reunion Hamburg 2022 [].
    • Add dhole to the speakers of the DebConf15 talk. []
    • Add raboof’s talk “Reproducible Builds for Trustworthy Binaries” from May Contain Hackers. []
    • Drop some Debian-related suggested ideas which are not really relevant anymore. []
    • Add a link to list of packages with patches ready to be NMUed. []
  • Mattia Rizzolo:

    • Add information about our upcoming event in Venice. [][][][]


Testing framework

The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, Holger Levsen made the following changes:

  • Debian-related changes:

    • Create graphs displaying existing .buildinfo files per each Debian suite/arch. [][]
    • Fix a typo in the Debian dashboard. [][]
    • Fix some issues in the pkg-r package set definition. [][][]
    • Improve the “builtin-pho” HTML output. [][][][]
    • Temporarily disable all live builds as our snapshot mirror is offline. []
  • Automated node health checks:

    • Detect dpkg failures. []
    • Detect files with bad UNIX permissions. []
    • Relax a regular expression in order to detect Debian Live image build failures. []
  • Misc changes:

    • Test that FreeBSD virtual machine has been updated to version 13.1. []
    • Add a reminder about powercycling the armhf-architecture mst0X node. []
    • Fix a number of typos. [][]
    • Update documentation. [][]
    • Fix Munin monitoring configuration for some nodes. []
    • Fix the static IP address for a node. []

In addition, Vagrant Cascadian updated host keys for the cbxi4pro0 and wbq0 nodes [] and, finally, node maintenance was also performed by Mattia Rizzolo [] and Holger Levsen [][][].


Contact

As ever, if you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

04 August, 2022 03:35PM

Abhijith PA

Trip to misty mountains in Munnar

Munnar is a hill station in the Idukki district of Kerala, India, and home to the second-largest tea plantation in the country. Lots of people visit in summer and in winter as well. I live in the district neighboring Munnar, though I had never made a visit. In my mind I pictured Munnar as a tourist trap with lots of garbage lying around.

I recently made a visit and it changed my perception of this place.

Munnar!

Little background

I never liked tea much, and I am not a coffee person either. But if I had to choose between the two, it would be coffee, because of the strong aroma. Visiting relatives’ houses, they always offered hot tea by default. I always find it difficult to say ‘no’ to their friendly gesture. But I hate tea.

The generation before me drinks a lot of tea here at my place. You can see tea stalls on every corner and people sipping tea. I always wondered why people drink so much tea in a hot country like India.

The book I am currently trying to read has a chapter about Munnar and how it became a tea plantation under British rule. Around the same time, I watched a documentary program about tea and Munnar.

Munnar

Munnar on early evening

With all these words here and there, I decided to make a visit. I took a motorbike and started the journey to Munnar. Due to covid restrictions there weren’t many tourists, which was to my advantage. There are many waterfalls on the way to Munnar; some are very close to the road and some are far away but can still be spotted. A Munnar trip is not just about the destination, because it has never been a single spot; it is about enjoying the journey that the ride has to offer.

I stayed at a hotel a little far away from town, though I would never recommend hotels in Munnar. Try to find homestays and small establishments away from the town. There are British-era bungalows inside the plantations, still maintained in good condition, which can be booked per room or as an entire property.

The lush greenery on the mountains of tea plantation is very refreshing and a feast for the eyes. The mornings and evenings of Munnar are something to watch: mountains wrapped in mist, slowly uncovering with the sunlight and slipping back into mist by dark evening. I planned to visit only places which are less explored by tourists.

People here live a simple life. Most of them are plantation workers. The native people of Munnar are actually tribal folks, but since the plantation boom many people from Tamil Nadu (the neighboring state) and other parts of Kerala have settled here. The houses of these plantation workers resemble the Hobbit homes in the Shire from The Lord of the Rings, as they sit on the hillsides. The Kannan Devan hills, the biggest hills in the area, cover more than half of Munnar.

Hobbit homes

Two famous tea companies from Munnar are Tata Tea and KDHP (Kanan Devan Hills Plantations Company (P) Limited). KDHP is actually an employee-owned tea company, i.e. a good share of the company is owned by the employees working there. This was interesting to me, so I bought a bag of speciality tea from the KDHP store on my return. I don’t drink tea on a daily basis but I will try it on special occasions.

04 August, 2022 09:53AM

August 03, 2022


Sean Whitton

Setting up a single-board desktop replacement with Consfigurator

The ThinkPad x220 that I had been using as an ssh terminal at home finally developed one too many hardware problems a few weeks ago, and so I ordered a Raspberry Pi 4b to replace it. Debian builds minimal SD card images for these machines already, but I wanted to use the usual ext4-on-LVM-on-LUKS setup for GNU/Linux workstations. So I used Consfigurator to build a custom image.

There are two key advantages to using Consfigurator to do something like this:

  1. As shown below, it doesn’t take a lot of code to define the host, it’s easily customisable without writing shell scripts, and it’s all declarative. (It’s quite a bit less code than Debian’s image-building scripts, though I haven’t carefully compared, and they are doing some additional setup beyond what’s shown below.)

  2. You can do nested block devices, as required for ext4-on-LVM-on-LUKS, without writing an intensely complex shell script to expand the root filesystem to fill the whole SD card on first boot. This is because Consfigurator can just as easily partition and install an actual SD card as it can write out a disk image, using the same host definition.

Consfigurator already had all the capabilities to do this, but as part of this project I did have to come up with the high-level wrapping API, which didn’t exist yet. My first SD card write wouldn’t boot because I had to learn more about kernel command lines; the second wouldn’t boot because of a minor bug in Consfigurator regarding /etc/crypttab; and the third build is the one I’m using, except that the first boot runs into a bug in cryptsetup-initramfs. So as far as Consfigurator is concerned I would like to claim that it worked on my second attempt, and had I not been using LUKS it would have worked on the first :)

The code

(defhost erebus.silentflame.com ()
  "Low powered home workstation in Tucson."
  (os:debian-stable "bullseye" :arm64)
  (timezone:configured "America/Phoenix")

  (user:has-account "spwhitton")
  (user:has-enabled-password "spwhitton")

  (disk:has-volumes
   (physical-disk
    (partitioned-volume

     ((partition
       :partition-typecode #x0700 :partition-bootable t :volume-size 512
       (fat32-filesystem :mount-point #P"/boot/firmware/"))

      (partition
       :volume-size :remaining

       (luks-container
        :volume-label "erebus_crypt"
        :cryptsetup-options '("--cipher" "xchacha20,aes-adiantum-plain64")

        (lvm-physical-volume :volume-group "vg_erebus"))))))

   (lvm-logical-volume
    :volume-group "vg_erebus"
    :volume-label "lv_erebus_root" :volume-size :remaining

    (ext4-filesystem :volume-label "erebus_root" :mount-point #P"/"
                     :mount-options '("noatime" "commit=120"))))

  (apt:installed "linux-image-arm64" "initramfs-tools"
                 "raspi-firmware" "firmware-brcm80211"
                 "cryptsetup" "cryptsetup-initramfs" "lvm2")
  (etc-default:contains "raspi-firmware"
                        "ROOTPART" "/dev/mapper/vg_erebus-lv_erebus_root"
                        "CONSOLES" "ttyS1,115200 tty0"))

and then you just insert the SD card and, at the REPL on your laptop,

CONSFIG> (hostdeploy-these laptop.example.com
           (disk:first-disk-installed-for nil erebus.silentflame.com #P"/dev/mmcblk0"))

There is more general information in the OS installation tutorial in the Consfigurator user’s manual.

Other niceties

  • Configuration management that’s just as easily applicable to OS installation as it is to the more usual configuration of hosts over SSH drastically improves the ratio of cost-to-benefit for including small customisations one is used to.

    For example, my standard Debian system configuration properties (omitted from the code above) meant that when I was dropped into an initramfs shell during my attempts to make an image that could boot itself, I found myself availed of my custom Space Cadet-inspired keyboard layout, without really having thought at any point “let’s do something to ensure I can have my usual layout while I’m figuring this out.” It was just included along with everything else.

  • As compared with the ThinkPad x220, it’s nice how the Raspberry Pi 4b is silent and doesn’t have any LEDs lit by default once it’s booted. A quirk of my room is that one plug socket is controlled by a switch right next to the switch for the ceiling light, so I’ve plugged my monitor into that outlet. Then when I’ve finished using the new machine I can flick that switch and the desk becomes completely silent and dark, without actually having to suspend the machine to RAM, thereby stopping cron jobs, preventing remote access from the office to fetch uncommitted files, etc..

03 August, 2022 12:07AM

August 02, 2022


Steinar H. Gunderson

AV1 live streaming: The bitrate difference

As part of looking into AV1, I wanted to get a feel for what kind of bitrate to aim for. Of course, Intel and others have made exquisite graphs containing results for many different encoders (although they might have wanted to spend a few more pixels on that latest one…), but you always feel suspicious that the results might be a bit cherry-picked. In particular, SVT-AV1 always seems to run on pretty wide machines (in this case, 48 cores/96 threads), and I wondered whether this was the primary reason for it doing so well.

So I made my own test, with the kind of footage I care about (a sports clip) at the speeds that I care about (realtime). I intentionally restricted the encoders to 16 cores (no HT, since that was the easiest), and tried various bitrates until I hit VMAF 85, which felt like a reasonable target. Of course, you don't always hit exactly 85.000 at any bit rate, VMAF is not a perfect measure, encoders encode for other metrics than VMAF, etc.…, but it's in the right ballpark. (Do note, though, that since I don't enable CBR but let both encoders work pretty freely within their 1-pass VBR, SVT-AV1's rate control issues won't really show up here. I consider that a feature of this graph, really, not a bug. But keep it in mind.)

Without further ado, the results:

x264 vs. SVT-AV1

There are two things that stick out immediately.

First, SVT-AV1 is just crushing x264. This is really impressive, given where it was a couple of years ago. Preset 11 is faster than x264 “faster”, and nearly half the bitrate (~55%) of x264 “slower”! (ultrafast and superfast are still somehow relevant, but only barely.) It seems we finally, nearly 20 years after the publication of H.264, have a codec that delivers the mythical bitrate halving. (At the “high” end, SVT-AV1 preset 8 uses 38% the bitrate of x264 “slower”, while being 23% faster. That's 2.58x the density at same VMAF.) Note that these cores are pretty slow; if you have 16 fast cores, you can go way past the 1.0 mark. For instance, my 5950X just barely reaches realtime speeds on preset 7.

Second, the difference between 8-bit and 10-bit is surprisingly high performance-wise (given Intel's claims that it shouldn't be too bad), and surprisingly low quality-wise. I've mulled a fair bit over this, given that earlier x264 SSIM-based tests showed a huge (20–30%) difference, and my conclusion is that it's probably a combination of VMAF not being too sensitive to the relevant artifacts (e.g. it's not very sensitive to banding), and the clip itself not really requiring it. I see a somewhat larger difference with libaom, and a somewhat larger difference at higher VMAFs, but in general, it seems you can forego 10-bit if you are e.g. worried about decoder performance. It's probably the safe option, though.

Again: This is just a spot check. But it seems to validate Intel's general gist; SVT-AV1 does really well for realtime compared to the old king x264. Even on lower-end servers/high-end desktops.
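
For anyone who wants to run a similar spot check, the encodes and the scoring can be driven with an ffmpeg build that includes libsvtav1, libx264 and libvmaf; the bitrates and preset below are illustrative assumptions, not the exact settings behind the graph above:

# candidate AV1 encode: 1-pass VBR, 10-bit, pinned to 16 cores
taskset -c 0-15 ffmpeg -i clip.mkv -an -c:v libsvtav1 -preset 11 \
    -b:v 3M -pix_fmt yuv420p10le av1.mkv
# x264 reference point at its own candidate bitrate
taskset -c 0-15 ffmpeg -i clip.mkv -an -c:v libx264 -preset slower -b:v 6M x264.mkv
# score the encode against the source with VMAF
ffmpeg -i av1.mkv -i clip.mkv -lavfi libvmaf -f null -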

02 August, 2022 10:11PM

Debian Suicide FYI

URGENT: debian.community domain seized, use new feed URLs, 7000 new leaked emails

URGENT: our domain has been stolen and will stop working on 29 July 2022

Please update your browser home page, bookmarks and links from any pages you control.

Use the new URL https://Suicide.FYI where we are continuing to document the Debian / Ubuntu culture and related suicides

You can find the new RSS and Atom feed URLs on the same page. As the debian.community domain has been stolen, we can not rely on a redirect.

New debian-private leaks for July and August 2022: the DebConf room-sharing lists are coming. See the conflicts of interest in Debian cabals bedroom-by-bedroom, bed-by-bed. Read it safely online without the grunts and odours of DebConf dormitories. These will be available on the new URL, https://Suicide.FYI

Other new leaks include another 7000 emails from debian-private. Please make sure you follow our new URL.

https://Suicide.FYI

We note that the Debian gods have only tried to steal the domain, they have not made any effort to dispute the evidence we publish about mentor/intern romances in Google Summer of Code (GSoC), Outreachy. Thanks to this legal process at WIPO, a lot more people are now coming to look at the web site.

If you rely on Debian and if you need to know the truth about modern slavery, people trafficking and integrity in open source, please follow and share the new URLs.

Photo Copyright (C) Andis Rado, Open Labs hackerspace, Tirana, Albania, copied from Wikimedia Commons

02 August, 2022 06:30AM

August 01, 2022

Death of Dr Alex Blewitt, UK

Debian Developers are shocked to hear about the recent death of Dr Alex Blewitt in July 2022. Dr Blewitt was well known for his work around Eclipse and as the Head of Cloud at the Santander bank.

The InfoQ web site, where he wrote many articles as an editor, has published an obituary:

It is with great sadness that we announce that InfoQ editor Dr. Alex Blewitt has unexpectedly passed away. Alex was a well-known writer within the technology space, Java Champion, Eclipse expert, meetup organizer, and more. He was a frequent presenter and attendee at conferences held around the globe, and he will be deeply missed within the InfoQ community.

Dr Blewitt was an early contributor to Debian. Dr Blewitt is one of those early contributors we were thinking about when responding to the WIPO harassment. Dr Blewitt was here before the trademark. Dr Blewitt was here before we were hoodwinked with the constitution. Dr Blewitt was here before the Code of Conduct (CoC) created the toxic environment of accusations and insults that is Debian today.

Bug report #2110 from 9 January 1996. That is before the BTS.

Bugs 85395 and 95835.

Dr Blewitt's name appears in the copyright notices for eclipse-egit and eclipse-jgit.

We have recently gone to great lengths to examine the suicide and subsequent cover-up involving Frans Pop in 2010. There is no evidence whatsoever about Dr Blewitt's cause of death and there is no evidence that he was recently engaged with Debianism. Nonetheless, each time there is another death in the open source communities, like the recent death of bullying victim Michael Bordlee in June, we feel compelled to begin examining the case.

If you have any news about deaths or related incidents of cyberbullying under the guise of CoC enforcement, please contact editor@suicide.fyi.

Dr Alex Blewitt, Santander, Eclipse, Milton Keynes, Debian, Obituary

01 August, 2022 09:30AM

Death of Michael Anthony Bordlee, New Orleans, Louisiana

On 17 June 2022, information was published for the first time confirming the suicide of Frans Pop and linking it to the history of cyberbullying Pop experienced in the Debian open source project.

Two days later, on 19 June 2022, an open-source VR artist, Michael Bordlee, passed away, age 29, in New Orleans. The cause of death has not been disclosed at this stage.

There is a tribute page for Michael. The cause of death is not mentioned.

We found that Michael had previously worked on a VR simulator to help people understand depression. There is a news article about it. The news article notes that Michael was himself a victim of bullying, much like Frans Pop. There is no statement connecting this to the death.

There have been a number of similar deaths in 2021, including the suspected suicide of 18 year old Redox OS developer Samuel (jd91mzm2 / LEGOlord208) and in Japan, the confirmed suicide of BSNES creator Near ( Byuu / Dave). Both Samuel and Near had described experiences of bullying.

Debian's Lucy Wayland had been exposed to the Debian Christmas lynchings shortly before her death in January 2019. This reminds us that witnesses to bullying can be affected as deeply as the victims. Another Debian Developer had even anticipated the risk of another suicide shortly before Wayland passed away.

Nobody listened.

The free software fellowship is now maintaining a list of all the open source deaths, including accidents and suicides. Please contribute if you have any more details about any of these cases.

Please email us if you have any evidence concerning Michael Bordlee or any other victims of bullying in the open source world.

Michael Anthony Bordlee

01 August, 2022 05:00AM

François Marier

Remote logging of Turris Omnia log messages using syslog-ng and rsyslog

As part of debugging an upstream connection problem I've been seeing recently, I wanted to be able to monitor the logs from my Turris Omnia router. Here's how I configured it to send its logs to a server I already had on the local network.

Server setup

The first thing I did was to open up my server's rsyslog (Debian's default syslog server) to remote connections since it's going to be the destination host for the router's log messages.

I added the following to /etc/rsyslog.d/router.conf:

module(load="imtcp")
input(type="imtcp" port="514")

if $fromhost-ip == '192.168.1.1' then {
    if $syslogseverity <= 5 then {
        action(type="omfile" file="/var/log/router.log")
    }
    stop
}

This is using the latest rsyslog configuration method: a handy scripting language called RainerScript. Severity level 5 maps to "notice" which consists of unusual non-error conditions, and 192.168.1.1 is of course the IP address of the router on the LAN side. With this, I'm directing all router log messages to a separate file, filtering out anything less important than severity 5.

In order for rsyslog to pick up this new configuration file, I restarted it:

systemctl restart rsyslog.service

and checked that it was running correctly (e.g. no syntax errors in the new config file) using:

systemctl status rsyslog.service

Since I added a new log file, I also setup log rotation for it by putting the following in /etc/logrotate.d/router:

/var/log/router.log
{
    rotate 4
    weekly
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        /usr/lib/rsyslog/rsyslog-rotate
    endscript
}

In addition, since I use logcheck to monitor my server logs and email me errors, I had to add /var/log/router.log to /etc/logcheck/logcheck.logfiles.

Finally I opened the rsyslog port to the router in my server's firewall by adding the following to /etc/network/iptables.up.rules:

# Allow logs from the router
-A INPUT -s 192.168.1.1 -p tcp --dport 514 -j ACCEPT

and ran iptables-apply.

With all of this in place, it was time to get the router to send messages.

Router setup

As suggested on the Turris forum, I ssh'ed into my router and added this in /etc/syslog-ng.d/remote.conf:

destination d_loghost {
        network("192.168.1.200" time-zone("America/Vancouver"));
};

source dns {
        file("/var/log/resolver");
};

log {
        source(src);
        source(net);
        source(kernel);
        source(dns);
        destination(d_loghost);
};

Setting the timezone to the same as my server was needed because the router messages were otherwise sent with UTC timestamps.

To ensure that the destination host always gets the same IP address (192.168.1.200), I went to the advanced DHCP configuration page and added a static lease for the server's MAC address so that it always gets assigned 192.168.1.200. If that wasn't already the server's IP address, you'll have to restart it for this to take effect.

Finally, I restarted the syslog-ng daemon on the router to pick up the new config file:

/etc/init.d/syslog-ng restart

Testing

In order to test this configuration, I opened three terminal windows:

  1. tail -f /var/log/syslog on the server
  2. tail -f /var/log/router.log on the server
  3. tail -f /var/log/messages on the router

I immediately started to see messages from the router in the third window, and some of these (not all, because of my severity-5 filter) were flowing to the second window as well. Also important is that none of the messages make it to the first window, otherwise log messages from the router would be mixed in with the server's own logs. That's the purpose of the stop command in /etc/rsyslog.d/router.conf.

To force a log message to be emitted by the router, simply ssh into it and issue the following command:

logger Test

It should show up in the second and third windows immediately if you've got everything set up correctly.

01 August, 2022 02:59AM


Junichi Uekawa

August.

August. I think I finally understood what's going on in io_uring.

01 August, 2022 12:07AM by Junichi Uekawa

July 31, 2022

Paul Wise

FLOSS Activities July 2022

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review

Administration

  • Debian BTS: unarchive/reopen/triage bugs for reintroduced packages
  • Debian servers: check full disks, ping users of excessive disk usage, restart a hung service
  • Debian wiki: approve accounts

Communication

  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors

The SPTAG, SIMDEverywhere, cwidget, aptitude, tldextract work was sponsored. All other work was done on a volunteer basis.

31 July, 2022 11:07PM



Joachim Breitner

The Via Alpina red trail through Slovenia

This July my girlfriend and I hiked the Slovenian part of the Red Trail of the Via Alpina, from the edge of the Julian Alps to Trieste, and I’d like to share some observations and tips that we might have found useful before our trip.

Our most favorite camp spot

Getting there

As we traveled with complete camping gear and wanted to stay in our tent, we avoided the high alpine parts of the trail and started just where the trail came down from the Alps and entered the Karst. A great way to get there is to take the night train from Zurich or Munich towards Ljubljana, get off at Jesenice, have breakfast, take the local train to Podbrdo and you can start your tour at 9:15am. From there you can reach the trail at Pedrovo Brdo within 1½h.

Finding the way

We did not use any paper maps, and instead relied on the OpenStreetMap data, which is very good, as well as the official(?) GPX tracks on Komoot, which are linked from the official route descriptions. We used OsmAnd.

Trails are generally very well marked (red circle with white center, and frequent signs), but the signs rarely tell you which way the Via Alpina goes, so the GPS was needed.

Sometimes the OpenStreetMap trail and the Komoot trail disagreed on short segments. We sometimes followed one and other times the other.

Variants

We diverged from the trail in a few places:

  • We did not care too much about the horses in Lipica and at least on the map it looked like a longish boringish and sun-exposed detour, so we cut the loop and hiked from Prelože pri Lokvi up onto the peak of the Veliko Gradišče (which unfortunately is too overgrown to provide a good view).

  • When we finally reached the top of Mali Kras and had a view across the bay of Trieste, it seemed silly to walk down to Dolina, and instead we followed the ridge through Socerb, essentially the Alpe Adria Trail.

  • Not really a variant, but after arriving in Muggia, if one has to go to Trieste, the ferry is probably a nicer way to finish a trek than the bus.

Pitching a tent

We used our tent almost every night, only in Idrija we got a room (and a shower…). It was not trivial to find good camp spots, because most of the trail is on hills with slopes and the flat spots tend to have houses built on them, but it is certainly possible. Sometimes we hid in the forest, other times we found nice small and freshly mowed meadows within the forest.

Water

Since this is Karst land, there is very little in terms of streams or lakes along the way, which is a pity.

The Idrijca river right south of Idrija was very tempting for a plunge. Unfortunately we passed there early in the day and we wanted to cover some ground first, so we refrained.

As for drinking water, we used the taps at the bathrooms of the various touristic sites, a few (but rare) public fountains, and finally resorted to just ringing random doorbells and asking for water, which always worked.

Paths

A few stages lead you through very pleasant narrow forest paths with a view, but not all of them. On some days you find yourself plodding along wide graveled or even paved forest roads, though.

Landscape and sights

The view from Nanos is amazing and, with this high peak jutting out over a wide plain, rather unique. It may seem odd that the trail goes up and down that mountain on the same day when it could go around, but it is certainly worth it.

The Karst is mostly a cultivated landscape, with lots of forestry. It is very hilly and green, which is pretty, but some might miss some craggedness. It’s not the high alps, after all, but at least they are in sight half the time.

But the upside is that there are a few sights along the way that are worth visiting, in particular the Franja Partisan Hospital hidden in a very narrow gorge, the Predjama Castle and the Škocjan Caves.

31 July, 2022 09:19AM by Joachim Breitner (mail@joachim-breitner.de)

Russell Coker

Workstations With ECC RAM

The last new PC I bought was a Dell PowerEdge T110II in 2013. That model had been out for a while and I got it for under $2000. Since then the CPI has gone up by about 20% so it’s probably about $2000 in today’s money. Currently Dell has a special on the T150 tower server (the latest replacement for the T110II) which has a G6405T CPU that isn’t even twice as fast as the i3-3220 (3746 vs 2219) in the T110II according to passmark.com (AKA cpubenchmark.net). The special price is $2600. I can’t remember the details of my choices when purchasing the T110II but I recall that CPU speed wasn’t a priority and I wanted a cheap reliable server for storage and for light desktop use. So it seems that the current entry model in the Dell T1xx server line is less than twice as fast as it was in 2013 while costing about 25% more! An option is to spend an extra $989 to get a Xeon E-2378 which delivers a reasonable 18,248 in that benchmark. The upside of a T150 is that it uses buffered DDR4 ECC RAM which is pretty cheap nowadays, you can get 32G for about $120.

For systems sold as workstations (as opposed to T1xx servers that make great workstations but aren’t described as such) Dell has the Precision line. The Precision 3260 “Compact Workstation” currently starts at $1740, it has a fast CPU but takes SO-DIMMs and doesn’t come with ECC RAM. So to use it as a proper workstation you need to discard the RAM and buy DDR5 unbuffered/unregistered ECC SO-DIMMS – which don’t seem to be on sale yet. The Precision 3460 is slightly larger, slightly more expensive, and also takes SO-DIMMs. The Precision 3660 starts at $2550 and takes unbuffered DDR5 ECC RAM which is available and costs half as much as the SO-DIMM equivalent would cost (if you could even buy it), but the general trend in RAM prices is that unbuffered ECC RAM is more expensive than buffered ECC RAM. The upside to Precision workstations is that the range of CPUs available is significantly faster than for the T150.

The HP web site doesn’t offer prices on their Z workstations and is generally worse than the Dell web site in most ways.

Overall I’m disappointed in the range of workstations available now. As an aside if anyone knows of any other company selling workstations in Australia that support ECC RAM then please let me know.

31 July, 2022 08:27AM by etbe

July 30, 2022

Mike Hommey

Announcing git-cinnabar 0.5.10

Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows you to clone, pull and push from/to mercurial remote repositories, using git.

Get it on github.

These release notes are also available on the git-cinnabar wiki.

What’s new since 0.5.9?

  • Fixed exceptions during config initialization.
  • Fixed swapped error messages.
  • Fixed correctness issues with bundle chunks with no delta node.
  • This is probably the last 0.5.x release before 0.6.0.

30 July, 2022 09:35PM by glandium

Ian Jackson

chiark’s skip-skip-cross-up-grade

Two weeks ago I upgraded chiark from Debian jessie i386 to bullseye amd64, after nearly 30 years running Debian i386. This went really quite well, in fact!

Background

chiark is my “colo” - a server I run, which lives in a data centre in London. It hosts ~200 users with shell accounts, various websites and mailing lists, moderators for a number of USENET newsgroups, and countless other services. chiark’s internal setup is designed to enable my users to do a maximum number of exciting things with a minimum of intervention from me.

chiark’s OS install dates to 1993, when I installed Debian 0.93R5, the first version of Debian to advertise the ability to be upgraded without reinstalling. I think that makes it one of the oldest Debian installations in existence.

Obviously it’s had several new hardware platforms too. (There was a prior install of Linux on the initial hardware, remnants of which can maybe still be seen in some obscure corners of chiark’s /usr/local.)

chiark’s install is also at the very high end of the installation complexity, and customisation, scale: reinstalling it completely would be an enormous amount of work. And it’s unique.

chiark’s upgrade history

chiark’s last major OS upgrade was to jessie (Debian 8, released in April 2015). That was in 2016. Since then we have been relying on Debian’s excellent security support posture, and the Debian LTS and more recently Freexian’s Debian ELTS projects, and some local updates. The use of ELTS - which supports only a subset of packages - was particularly uncomfortable.

Additionally, chiark was installed with 32-bit x86 Linux (Debian i386), since that was what was supported and available at the time. But 32-bit is looking very long in the tooth.

Why do a skip upgrade

So, I wanted to move to the fairly recent stable release - Debian 11 (bullseye), which is just short of a year old. And I wanted to “crossgrade” (as it’s called) to 64-bit.

In the past, I have found I have had greater success by doing “direct” upgrades, skipping intermediate releases, rather than by following the officially-supported path of going via every intermediate release.

Doing a skip upgrade avoids exposure to any packaging bugs which were present only in intermediate release(s). Debian does usually fix bugs, but Debian has many cautious users, so it is not uncommon for bugs to be found after release, and then not be fixed until the next one.

A skip upgrade avoids the need to try to upgrade to already-obsolete releases (which can involve messing about with multiple snapshots from snapshot.debian.org). It is also significantly faster and simpler, which is important not only because it reduces downtime, but also because it removes opportunities (and reduces the time available) for things to go badly.

One downside is that sometimes maintainers aggressively remove compatibility measures for older releases. (And compatibility packages are generally removed quite quickly by even cautious maintainers.) That means that the sysadmin who wants to skip-upgrade needs to do more manual fixing of things that haven’t been dealt with automatically. And occasionally one finds compatibility problems that show up only when mixing very old and very new software, that no-one else has seen.

Crossgrading

Crossgrading is fairly complex and hazardous. It is well supported by the low level tools (eg, dpkg) but the higher-level packaging tools (eg, apt) get very badly confused.

Nowadays the system is so complex that downloading things by hand and manually feeding them to dpkg is impractical, other than as a very occasional last resort.

The approach, generally, has been to set the system up to “want to” be the new architecture, run apt in a download-only mode, and do the package installation manually, with some fixing up and retrying, until the system is coherent enough for apt to work.

This is the approach I took. (In current releases, there are tools that will help but they are only in recent releases and I wanted to go direct. I also doubted that they would work properly on chiark, since it’s so unusual.)
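As a very rough sketch of the shape of that process (not my actual scripts; the package list, paths and ordering here are illustrative assumptions only, and a real crossgrade needs far more care and many more packages):

# Make amd64 known to dpkg as an additional architecture.
dpkg --add-architecture amd64
dpkg --print-foreign-architectures
apt-get update

# Download, but do not yet install, the new-architecture packages.
# (Illustrative package list only; the real plan needs many more.)
apt-get --download-only install dpkg:amd64 apt:amd64 libc6:amd64

# Feed the downloaded .debs to dpkg by hand, fixing up and retrying,
# until the system is coherent enough for apt to take over again.
dpkg --install /var/cache/apt/archives/*_amd64.deb
apt-get -f install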

Peril and planning

Overall, this was a risky strategy to choose. The package dependencies wouldn’t necessarily express all of the sequencing needed. But it still seemed that if I could come up with a working recipe, I could do it.

I restored most of one of chiark’s backups onto a scratch volume on my laptop. With the LVM snapshot tools and chroots, I was able to develop and test a set of scripts that would perform the upgrade. This was a very effective approach: my super-fast laptop, with local caches of the package repositories, was able to do many “edit, test, debug” cycles.

My recipe made heavy use of snapshot.debian.org, to make sure that it wouldn’t rot between testing and implementation.

When I had a working scheme, I told my users about the planned downtime. I warned everyone it might take even 2 or 3 days. I made sure that my access arrangements to the data centre were in place, in case I needed to visit in person. (I have remote serial console and power cycler access.)

Reality - the terrible rescue install

My first task on taking the service down was to check that the emergency rescue installation worked: chiark has an ancient USB stick in the back, which I can boot to from the BIOS. The idea being that many things that go wrong could be repaired from there.

I found that that install was too old to understand chiark’s storage arrangements. mdadm tools gave very strange output. So I needed to upgrade it. After some experiments, I rebooted back into the main install, bringing chiark’s service back online.

I then used the main install of chiark as a kind of meta-rescue-image for the rescue-image. The process of getting the rescue image upgraded (not even to amd64, but just to something not totally ancient) was fraught. Several times I had to rescue it by copying files in from the main install outside. And, the rescue install was on a truly ancient 2G USB stick which was terribly terribly slow, and also very small.

I hadn’t done any significant planning for this subtask, because it was low-risk: there was little way to break the main install. Due to all these adverse factors, sorting out the rescue image took five hours.

If I had known how long it would take, at the beginning, I would have skipped it. 5 hours is more than it would have taken to go to London and fix something in person.

Reality - the actual core upgrade

I was able to start the actual upgrade in the mid-afternoon. I meticulously checked and executed the steps from my plan.

The terrifying scripts which sequenced the critical package updates ran flawlessly. Within an hour or so I had a system which was running bullseye amd64, albeit with many important packages still missing or unconfigured.

So I didn’t need the rescue image after all, nor to go to the datacentre.

Fixing all the things

Then I had to deal with all the inevitable fallout from an upgrade.

Notable incidents:

exim4 has a new tainting system

This is to try to help the sysadmin avoid writing unsafe string interpolations. (“Little Bobby Tables.”) This was done by Exim upstream in a great hurry as part of a security response process.

The new checks meant that the mail configuration did not work at all. I had to turn off the taint check completely. I’m fairly confident that this is correct, because I am hyper-aware of quoting issues and all of my configuration is written to avoid the problems that tainting is supposed to avoid.

One particular annoyance is that the approach taken for sqlite lookups makes it totally impossible to use more than one sqlite database. I think the sqlite quoting operator which one uses to interpolate values produces tainted output? I need to investigate this properly.

LVM now ignores PVs which are directly contained within LVs by default

chiark has LVM-on-RAID-on-LVM. This generally works really well.

However, there was one edge case where I ended up without the intermediate RAID layer. The result is LVM-on-LVM.

But recent versions of the LVM tools do not look at PVs inside LVs, by default. This is to help you avoid corrupting the state of any VMs you have on your system. I didn’t know that at the time, though. All I knew was that LVM was claiming my PV was “unusable”, and wouldn’t explain why.

I was about to start on a thorough reading of the 15,000-word essay that is the commentary in the default /etc/lvm/lvm.conf to try to see if anything was relevant, when I received a helpful tipoff on IRC pointing me to the scan_lvs option.
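For reference, the relevant knob lives in the devices section of /etc/lvm/lvm.conf; a minimal sketch (only the option itself is shown here, not the surrounding commentary):

devices {
    # Allow LVM to scan LVs for PV signatures, needed for LVM-on-LVM stacks.
    # The default is 0 precisely so that a host does not touch PVs that
    # belong to its guests' disk images.
    scan_lvs = 1
}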

I need to file a bug asking for the LVM tools to explain why they have declared a PV unusable.

apache2’s default config no longer read one of my config files

I had to do a merge (of my changes vs the maintainers’ changes) for /etc/apache2/apache2.conf. When doing this merge I failed to notice that the file /etc/apache2/conf.d/httpd.conf was no longer included by default. My merge dropped that line. There were some important things in there, and until I found this the webserver was broken.
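If you carry a similar local file, the dropped line is easy to re-add by hand; something along these lines should do it (a hedged sketch, assuming the legacy conf.d layout rather than the modern conf-available/conf-enabled scheme - adjust the path to your local setup):

# In /etc/apache2/apache2.conf: re-include the local legacy config file.
IncludeOptional /etc/apache2/conf.d/httpd.conf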

dpkg --skip-same-version DTWT during a crossgrade

(This is not a “fix all the things” - I found it when developing my upgrade process.)

When doing a crossgrade, one often wants to say to dpkg “install all these things, but don’t reinstall things that have already been done”. That’s what --skip-same-version is for.
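Typical usage during such a mass-install step looks something like this (illustrative only; the exact package set and extra options depend on the plan):

# Install everything that was downloaded, but skip packages whose exact
# version is already installed.
dpkg --install --skip-same-version --auto-deconfigure \
     /var/cache/apt/archives/*.deb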

However, the logic had not been updated as part of the work to support multiarch, so it was wrong. I prepared a patched version of dpkg, and inserted it in the appropriate point in my prepared crossgrade plan.

The patch is now filed as bug #1014476 against dpkg upstream

Mailman

Mailman is no longer in bullseye. It’s only available in the previous release, buster.

bullseye has Mailman 3 which is a totally different system - requiring basically, a completely new install and configuration. To even preserve existing archive links (a very important requirement) is decidedly nontrivial.

I decided to punt on this whole situation. Currently chiark is running buster’s version of Mailman. I will have to deal with this at some point and I’m not looking forward to it.

Python

Of course, that Mailman is Python 2. The Python project’s extremely badly handled transition includes a recommendation to change the meaning of #!/usr/bin/python from Python 2, to Python 3.

But Python 3 is a new language, barely compatible with Python 2 even in the most recent iterations of both, and it is usual to need to coinstall them.

Happily Debian have provided the python-is-python2 package to make things work sensibly, albeit with unpleasant imprecations in the package summary description.

USENET news

Oh my god. INN uses many non-portable data formats, which just depend on your C types. And there are complicated daemons, statically linked libraries which cache on-disk data, and much to go wrong.

I had numerous problems with this, and several outages and malfunctions. I may write about that on a future occasion.

(edited 2022-07-20 11:36 +01:00 and 2022-07-30 12:28+01:00 to fix typos)



30 July, 2022 11:27AM

July 29, 2022

hackergotchi for Bits from Debian

Bits from Debian

New Debian Developers and Maintainers (May and June 2022)

The following contributors got their Debian Developer accounts in the last two months:

  • Geoffroy Berret (kaliko)
  • Arnaud Ferraris (aferraris)

The following contributors were added as Debian Maintainers in the last two months:

  • Alec Leanas
  • Christopher Michael Obbard
  • Lance Lin
  • Stefan Kropp
  • Matteo Bini
  • Tino Didriksen

Congratulations!

29 July, 2022 02:00PM by Jean-Pierre Giraud

July 28, 2022

hackergotchi for Matthew Garrett

Matthew Garrett

UEFI rootkits and UEFI secure boot

Kaspersky describes a UEFI-implant used to attack Windows systems. Based on it appearing to require patching of the system firmware image, they hypothesise that it's propagated by manually dumping the contents of the system flash, modifying it, and then reflashing it back to the board. This probably requires physical access to the board, so it's not especially terrifying - if you're in a situation where someone's sufficiently enthusiastic about targeting you that they're reflashing your computer by hand, it's likely that you're going to have a bad time regardless.

But let's think about why this is in the firmware at all. Sophos previously discussed an implant that's sufficiently similar in some technical details that Kaspersky suggest they may be related to some degree. One notable difference is that the MyKings implant described by Sophos installs itself into the boot block of legacy MBR partitioned disks. This code will only be executed on old-style BIOS systems (or UEFI systems booting in BIOS compatibility mode), and they have no support for code signatures, so there's no need to be especially clever. Run malicious code in the boot block, patch the next stage loader, follow that chain all the way up to the kernel. Simple.

One notable distinction here is that the MBR boot block approach won't be persistent - if you reinstall the OS, the MBR will be rewritten[1] and the infection is gone. UEFI doesn't really change much here - if you reinstall Windows a new copy of the bootloader will be written out and the UEFI boot variables (that tell the firmware which bootloader to execute) will be updated to point at that. The implant may still be on disk somewhere, but it won't be run.

But there's a way to avoid this. UEFI supports loading firmware-level drivers from disk. If, rather than providing a backdoored bootloader, the implant takes the form of a UEFI driver, the attacker can set a different set of variables that tell the firmware to load that driver at boot time, before running the bootloader. OS reinstalls won't modify these variables, which means the implant will survive and can reinfect the new OS install. The only way to get rid of the implant is to either reformat the drive entirely (which most OS installers won't do by default) or replace the drive before installation.

This is much easier than patching the system firmware, and achieves similar outcomes - the number of infected users who are going to wipe their drives to reinstall is fairly low, and the kernel could be patched to hide the presence of the implant on the filesystem[2]. It's possible that the goal was to make identification as hard as possible, but there's a simpler argument here - if the firmware has UEFI Secure Boot enabled, the firmware will refuse to load such a driver, and the implant won't work. You could certainly just patch the firmware to disable secure boot and lie about it, but if you're at the point of patching the firmware anyway you may as well just do the extra work of installing your implant there.

I think there's a reasonable argument that the existence of firmware-level rootkits suggests that UEFI Secure Boot is doing its job and is pushing attackers into lower levels of the stack in order to obtain the same outcomes. Technologies like Intel's Boot Guard may (in their current form) tend to block user choice, but in theory should be effective in blocking attacks of this form and making things even harder for attackers. It should already be impossible to perform attacks like the one Kaspersky describes on more modern hardware (the system should identify that the firmware has been tampered with and fail to boot), which pushes things even further - attackers will have to take advantage of vulnerabilities in the specific firmware they're targeting. This obviously means there's an incentive to find more firmware vulnerabilities, which means the ability to apply security updates for system firmware as easily as security updates for OS components is vital (hint hint if your system firmware updates aren't available via LVFS you're probably doing it wrong).

We've known that UEFI rootkits have existed for a while (Hacking Team had one in 2015), but it's interesting to see a fairly widespread one out in the wild. Protecting against this kind of attack involves securing the entire boot chain, including the firmware itself. The industry has clearly been making progress in this respect, and it'll be interesting to see whether such attacks become more common (because Secure Boot works but firmware security is bad) or not.

[1] As we all remember from Windows installs overwriting Linux bootloaders
[2] Although this does run the risk of an infected user booting another OS instead, and being able to see the implant


28 July, 2022 10:19PM

Dominique Dumont

How I investigated connection hogs on Kubernetes

Hi

My name is Dominique Dumont, DevOps freelance in Grenoble, France.

My goal is to share my experience regarding a production issue that occurred last week where my client complained that the applications were very slow and sometimes showed 5xx errors. The production service is hosted on a Kubernetes cluster on Azure and uses MongoDB on ScaleGrid.

I reproduced the issue on my side and found that the API calls were randomly failing due to timeouts on server side.

The server logs were showing some MongoDB disconnections and reconnections and some time-outs on MongoDB connections, but did not give any clue on why some connections to the MongoDB server were failing.

Since there was no clue in the cluster logs, I looked at ScaleGrid monitoring. There were about 2500 connections on MongoDB (see screenshot 2022-07-19-scalegrid-connection-leak.png). That seemed quite a lot given the low traffic at that time, but not necessarily a problem.

Then, I went to the Azure console, and I got the first hint about the origin of the problem: the SNATs were exhausted on some nodes of the cluster (see screenshot 2022-07-28_no-more-free-snat.png).

SNATs are involved in connections from the cluster to the outside world, i.e. to our MongoDB server and are quite limited: only 1024 SNAT ports are available per node. This was consistent with the number of used connections on MongoDB.

OK, then the number of used connections on MongoDB was a real problem.

The next question was: which pods, and how many connections?

First I had to filter out the pods that did not use MongoDB. Fortunately, all our pods have labels so I could list all pods using MongoDB:

$ kubectl -n prod get pods -l db=mongo | wc -l
236

Hmm, still quite a lot.

The next problem was to check which pods were using too many MongoDB connections. Unfortunately, the logs mentioned that a connection to MongoDB was opened, but that did not give a clue on how many were in use.

Netstat is not installed on the pods, and cannot be installed since the pods are not running as root (which is a good idea for security reasons).

Then, my Debian Developer experience kicked in and I remembered that /proc file system on Linux gives a lot of information on consumed kernel resources, including resources consumed by each process.

The trick is to know the PID of the process using the connections.

In our case, the Dockerfiles are written in such a way that the main NodeJS process of a pod has PID 1, so the command to list the connections of a pod is:

$ kubectl -n prod exec redacted-pod-name-69875496f8-8bj4f -- cat /proc/1/net/tcp
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: AC00F00A:C9FA C2906714:6989 01 00000000:00000000 02:00000DA9 00000000  1001        0 376439162 2 0000000000000000 21 4 0 10 -1                 
   1: AC00F00A:CA00 C2906714:6989 01 00000000:00000000 02:00000E76 00000000  1001        0 376439811 2 0000000000000000 21 4 0 10 -1                 
   2: AC00F00A:8ED0 C2906714:6989 01 00000000:00000000 02:000004DA 00000000  1001        0 445806350 2 0000000000000000 21 4 30 10 -1                
   3: AC00F00A:CA02 C2906714:6989 01 00000000:00000000 02:000000DD 00000000  1001        0 376439812 2 0000000000000000 21 4 0 10 -1                 
   4: AC00F00A:C9FE C2906714:6989 01 00000000:00000000 02:00000DA9 00000000  1001        0 376439810 2 0000000000000000 21 4 0 10 -1                 
   5: AC00F00A:8760 C2906714:6989 01 00000000:00000000 02:00000810 00000000  1001        0 375803096 2 0000000000000000 21 4 0 10 -1                 
   6: AC00F00A:C9FC C2906714:6989 01 00000000:00000000 02:00000DA9 00000000  1001        0 376439809 2 0000000000000000 21 4 0 10 -1                 
   7: AC00F00A:C56C C2906714:6989 01 00000000:00000000 02:00000DA9 00000000  1001        0 376167298 2 0000000000000000 21 4 0 10 -1                 
   8: AC00F00A:883C C2906714:6989 01 00000000:00000000 02:00000734 00000000  1001        0 375823415 2 0000000000000000 21 4 30 10 -1 

OK, that’s less appealing than netstat output. The trick is that rem_address and its port are expressed in hexadecimal. A quick calculation confirms that port 0x6989 is indeed port 27017, which is the listening port of the MongoDB server.
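The conversion is easy to double-check with a one-liner (just a sanity check, not part of the original workflow):

$ printf '%d\n' 0x6989
27017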

So the number of opened MongoDB connections is given by:

$ kubectl -n prod exec redacted-pod-name-69875496f8-8bj4f -- cat /proc/1/net/tcp | grep :6989 | wc -l
9

What’s next ?

The ideal solution would be to fix the NodeJS code to handle correctly the termination of the connections, but that would have taken too long to develop.

So I’ve written a small Perl script to:

  • list the pods using MongoDB using kubectl -n prod get pods -l db=mongo
  • find the pods using more that 10 connections using the kubectl exec command shown above
  • compute the deployment name of these pods (which was possible given the naming convention used with our pods and deployments)
  • restart the deployment of these pods with a kubectl rollout restart deployment command
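A rough shell equivalent of that logic, to give the idea (the real script is in Perl and is not reproduced here; the namespace, the label and the pod-to-deployment naming rule below are assumptions taken from this post):

for pod in $(kubectl -n prod get pods -l db=mongo -o name); do
  pod=${pod#pod/}
  # count the TCP connections of PID 1 towards port 27017 (0x6989)
  n=$(kubectl -n prod exec "$pod" -- cat /proc/1/net/tcp | grep -c :6989)
  if [ "$n" -gt 10 ]; then
    # strip the ReplicaSet and pod hash suffixes to get the deployment
    # name (assumed naming convention)
    deploy=$(echo "$pod" | sed -E 's/-[a-z0-9]+-[a-z0-9]+$//')
    kubectl -n prod rollout restart deployment "$deploy"
  fi
done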

Why restart a deployment instead of simply deleting the gluttonous pods? I wanted to avoid downtime if all pods of a deployment were to be killed. There’s no downtime when applying rollout restart command on deployments.

This script is now run regularly until the connections issue is fixed for good in NodeJS code. Thanks to this script, there’s no need to rush a code modification.

All in all, working around this connection issue was made somewhat easier thanks to:

  • the monitoring tools provided by the hosting services.
  • a good knowledge of Linux internals
  • consistent labels on our pods
  • the naming conventions used for our kubernetes artifacts

28 July, 2022 12:10PM by dod

July 27, 2022

Vincent Bernat

ClickHouse SF Bay Area Meetup: Akvorado

Here are the slides I presented for a ClickHouse SF Bay Area Meetup in July 2022, hosted by Altinity. They are about Akvorado, a network flow collector and visualizer, and notably on how it relies on ClickHouse, a column-oriented database.

The meetup was recorded and is available on YouTube. Here is the part relevant to my presentation, with subtitles:1

I got a few questions about how to get information from the higher layers, like HTTP. As my use case for Akvorado was at the network edge, my answers were mostly negative. However, as sFlow is extensible, when collecting flows from Linux servers instead, you could embed additional data and they could be exported as well.

I also got a question about doing aggregation in a single table. ClickHouse can automatically aggregate data using TTL. My answer for not doing that was only partial. There is another reason: the retention periods of the various tables may overlap. For example, the main table keeps data for 15 days, but even in these 15 days, if I do a query on a 12-hour window, it is faster to use the flows_1m0s aggregated table, unless I request something about ports and IP addresses.


  1. To generate the subtitles, I have used Amazon Transcribe, the speech-to-text solution from Amazon AWS. Unfortunately, there is no en-FR language available, which would have been useful for my terrible accent. While the subtitles were 100% accurate when the host, Robert Hodge from Altinity, was speaking, the success rate on my talk was quite lower. I had to rewrite almost all sentences. However, using speech-to-text is still useful to get the timings, as it is also something requiring a lot of work to do manually. ↩︎

27 July, 2022 09:00PM by Vincent Bernat

July 26, 2022

Debian Suicide FYI

Colin Watson, Steve McIntyre & Debian, Ubuntu cover-up mission after Frans Pop suicide

We present a new email today revealing that Steve McIntyre (sledge), formerly of ARM Ltd, was invited to remove the servers from the home of Frans Pop.

The email reveals that Debian was Pop's entire way of life. He lived with his servers doing free unpaid work for different Debian teams.

If a volunteer kills himself and gives you a house full of servers it smells a lot like some Debian people got a reward for working Frans Pop to death.

Therefore, if there are no consequences for killing somebody, if they get a reward from his assets, there is no reason for these people to change their behavior towards volunteers.

To put the email in context, remember that Colin Watson was trying to downplay the significance of that phrase "His main concern was his work for Debian". Yet he lived surrounded by his computers.

We could think of this visit to Frans Pop's house as some sort of cover-up mission.

McIntyre also tries to use Pop's previous cancer as a scapegoat for the suicide. His musings about this are absurd. McIntyre lacks the competence to make judgments about such medical matters, just as WIPO lawyers lack the competence to judge who is or isn't a developer. McIntyre's comments about a cancer show us that Debian people are simply unable to admit there is a possibility that they worked Frans Pop to death.

Subject: Re: Death of Frans Pop
Date: Sun, 22 Aug 2010 12:21:47 +0100
From: Steve McIntyre <steve@einval.com>
To: debian-private@lists.debian.org

I'm saddened to see the news about Frans triggering arguments, but I
suppose it's not too much of a surprise - shock and grief can cause
all kinds of reactions in people. :-(

On Sat, Aug 21, 2010 at 11:47:34AM +0100, Steve McIntyre wrote:
>Hi all,
>
>I have bad news to share with people, I'm afraid. This morning, I've
>just received an email from the parents of Frans Pop telling me that
>he died yesterday.
>
>"Yesterday morning our son Frans Pop has died. He took his own life,
>in a well-considered, courageous, and considerate manner. During the
>last years his main concern was his work for Debian. I would like to
>ask you to inform those members of the Debian community who knew him
>well."

I've had another email from them today. Something that many/most
people will not have known before now was that Frans had been
suffering from thyroid cancer. He went into hospital a couple of years
ago for treatment and only mentioned it to a few of us at the time. He
had not mentioned it since, leading me to assume that he was
cured. Now I'm not sure either way.

I didn't mention this illness to people here yesterday while I asked
his father if it might have been a factor in Frans' choice to end his
life. I've just had confirmation this morning that apparently it was
*not*. Frans had other reasons, although I'm still personally
wondering if there might have some contribution.

I'm going to ask them about a funeral/memorial service and whether or
not some of Frans' Debian colleagues and friends will be welcome. If
so, I'm planning to head along myself and I trust a number of more
local Debian folk will too. I'll also ask whether an official
statement or dedication (of some sort) on behalf of the project will
be acceptable.

Finally, a more mundane matter. Frans was hosting/using a number of
machines at his house and asked that they be passed back to Debian.
Please contact me *off-list* if you can help. His parents live a fair
way from the town where he lived, so will need to arrange to travel
there to meet people. They'd therefore appreciate it if one person can
take care of everything.

Please respect the privacy of this information; I'm expecting that
we'll make a co-ordinated public statement of some sort in the near
future.

Thanks,
-- 
Steve McIntyre, Cambridge, UK.                                steve@einval.com
"I can't ever sleep on planes ... call it irrational if you like, but I'm
 afraid I'll miss my stop" -- Vivek Dasmohapatra

Watson's earlier email downplaying the role of Debian in the suicide:

Subject: Re: Death of Frans Pop
Date: Sat, 21 Aug 2010 13:39:21 +0100
From: Colin Watson <cjwatson@debian.org>
To: debian-private@lists.debian.org

On Sat, Aug 21, 2010 at 01:52:33PM +0200, Ludovic Brenta wrote:
> Steve McIntyre <steve@einval.com> writes:
> > "Yesterday morning our son Frans Pop has died. He took his own life,
> > in a well-considered, courageous, and considerate manner. During the
> > last years his main concern was his work for Debian. I would like to
> > ask you to inform those members of the Debian community who knew him
> > well."
> 
> Does that imply he took his own life *because* of Debian, which was "his
> main concern"?

This is probably the wrong thread for linguistics, but that phrase would
normally just indicate that Debian was his main interest.  In
http://oxforddictionaries.com/view/entry/m_en_gb0169810 under "noun",
this would be sense 2 rather than sense 1.

-- 
Colin Watson                                       [cjwatson@debian.org]
Frans Pop, Colin Watson, Steve McIntyre, sledge, debian, ubuntu, suicide

26 July, 2022 12:00PM

hackergotchi for Wouter Verhelst

Wouter Verhelst

Planet Grep now running PtLink

Almost 2 decades ago, Planet Debian was created using the "planetplanet" RSS aggregator. A short while later, I created Planet Grep using the same software.

Over the years, the blog aggregator landscape has changed a bit. First of all, planetplanet was abandoned, forked into Planet Venus, and then abandoned again. Second, the world of blogging (aka the "blogosphere") has largely disappeared, and the more modern world uses things like "Social Networks", etc, making blogs less relevant these days.

A blog aggregator community site is still useful, however, and so I've never taken Planet Grep down, even though over the years the number of blogs carried on Planet Grep has been shrinking. In the past almost 20 years, I've just run Planet Grep on my personal server, upgrading its Debian release from whichever was the most recent stable release in 2005 to buster, never encountering any problems.

That all changed when I did the upgrade to Debian bullseye, however. Planet Venus is a Python 2 application, which was never updated to Python 3. Since Debian bullseye drops support for much of Python 2, focusing only on Python 3 (in accordance with Python upstream's policy on the matter), I have had to run Planet Venus from inside a VM for a while now, which works as a short-term solution but not as a long-term one.

Although there are other implementations of blog aggregation software out there, I wanted to stick with something (mostly) similar. Additionally, I have been wanting to add functionality to it to also pull stuff from Social Networks, where possible (and legal, since some of these have... scary Terms Of Use documents).

So, as of today, Planet Grep is no longer powered by Planet Venus, but instead by PtLink. Rather than Python, it was written in Perl (a language with which I am more familiar), and there are plans for me to extend things in ways that have little to do with blog aggregation anymore...

There are a few other Planets out there that also use Planet Venus at this point -- Planet Debian and Planet FSFE are two that I'm currently already aware of, but I'm sure there might be more, too.

At this point, PtLink is not yet at feature parity with Planet Venus -- as shown by the fact that it can't yet build either Planet Debian or Planet FSFE successfully. But I'm not stopping my development here, and hopefully I'll have something that successfully builds both of those soon, too.

As a side note, PtLink is not intended to be bug compatible with Planet Venus. For one example, the configuration for Planet Grep contains an entry for Frederic Descamps, but somehow Planet Venus failed to fetch his feed. With the switch to PtLink, that seems fixed, and now some entries from Frederic seem to appear. I'm not going to be "fixing" that feature... but of course there might be other issues that will appear. If that's the case, let me know.

If you're reading this post through Planet Grep, consider this a public service announcement for the possibility (hopefully a remote one) of minor issues.

26 July, 2022 10:15AM

July 25, 2022

hackergotchi for Bits from Debian

Bits from Debian

DebConf22 closes in Prizren and DebConf23 dates announced

DebConf22 group photo

On Sunday 24 July 2022, the annual Debian Developers and Contributors Conference came to a close. Hosting 260 attendees from 38 different countries over a combined 91 event talks, discussion sessions, Birds of a Feather (BoF) gatherings, workshops, and activities, DebConf22 was a large success.

The conference was preceded by the annual DebCamp held 10 July to 16 July which focused on individual work and team sprints for in-person collaboration towards developing Debian. In particular, this year there have been sprints to advance development of Mobian/Debian on mobile, reproducible builds and Python in Debian, and a BootCamp for newcomers, to get introduced to Debian and have some hands-on experience with using it and contributing to the community.

The actual Debian Developers Conference started on Sunday 17 July 2022. Together with activities such as the traditional 'Bits from the DPL' talk, the continuous key-signing party, lightning talks and the announcement of next year's DebConf (DebConf23 in Kochi, India), there were several sessions related to programming language teams such as Python, Perl and Ruby, as well as news updates on several projects and internal Debian teams, discussion sessions (BoFs) from many technical teams (Long Term Support, Android tools, Debian Derivatives, Debian Installer and Images team, Debian Science...) and local communities (Debian Brasil, Debian India, the Debian Local Teams), along with many other events of interest regarding Debian and free software.

The schedule was updated each day with planned and ad-hoc activities introduced by attendees over the course of the entire conference. Several activities that couldn't be organized in past years due to the COVID pandemic returned to the conference's schedule: a job fair, open-mic and poetry night, the traditional Cheese and Wine party, the group photos and the Day Trip.

For those who were not able to attend, most of the talks and sessions were recorded and live streamed, with the videos made available through the Debian meetings archive website. Almost all of the sessions facilitated remote participation via IRC messaging apps or online collaborative text documents.

The DebConf22 website will remain active for archival purposes and will continue to offer links to the presentations and videos of talks and events.

Next year, DebConf23 will be held in Kochi, India, from September 10 to September 16, 2023. As per tradition, before the next DebConf the local organizers in India will start the conference activities with DebCamp (September 03 to September 09, 2023), with particular focus on individual and team work towards improving the distribution.

DebConf is committed to a safe and welcoming environment for all participants. See the web page about the Code of Conduct on the DebConf22 website for more details on this.

Debian thanks the commitment of numerous sponsors to support DebConf22, particularly our Platinum Sponsors: Lenovo, Infomaniak, ITP Prizren and Google.

About Debian

The Debian Project was founded in 1993 by Ian Murdock to be a truly free community project. Since then the project has grown to be one of the largest and most influential open source projects. Thousands of volunteers from all over the world work together to create and maintain Debian software. Available in 70 languages, and supporting a huge range of computer types, Debian calls itself the universal operating system.

About DebConf

DebConf is the Debian Project's developer conference. In addition to a full schedule of technical, social and policy talks, DebConf provides an opportunity for developers, contributors and other interested people to meet in person and work together more closely. It has taken place annually since 2000 in locations as varied as Scotland, Argentina, and Bosnia and Herzegovina. More information about DebConf is available from https://debconf.org/.

About Lenovo

As a global technology leader manufacturing a wide portfolio of connected products, including smartphones, tablets, PCs and workstations as well as AR/VR devices, smart home/office and data center solutions, Lenovo understands how critical open systems and platforms are to a connected world.

About Infomaniak

Infomaniak is Switzerland's largest web-hosting company, also offering backup and storage services, solutions for event organizers, live-streaming and video on demand services. It wholly owns its datacenters and all elements critical to the functioning of the services and products provided by the company (both software and hardware).

About ITP Prizren

Innovation and Training Park Prizren intends to be a changing and boosting element in the area of ICT, agro-food and creatives industries, through the creation and management of a favourable environment and efficient services for SMEs, exploiting different kinds of innovations that can contribute to Kosovo to improve its level of development in industry and research, bringing benefits to the economy and society of the country as a whole.

About Google

Google is one of the largest technology companies in the world, providing a wide range of Internet-related services and products such as online advertising technologies, search, cloud computing, software, and hardware.

Google has been supporting Debian by sponsoring DebConf for more than ten years, and is also a Debian partner sponsoring parts of Salsa's continuous integration infrastructure within Google Cloud Platform.

Contact Information

For further information, please visit the DebConf22 web page at https://debconf22.debconf.org/ or send mail to press@debian.org.

25 July, 2022 08:30AM by Debian Publicity Team

July 24, 2022

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

AV1 live streaming: Exploring SVT-AV1 rate control

I'm looking into AV1 live streaming these days; it's still very early, but it looks like enough of the required parts may finally align, and it seems it's the way I'll have to go to get to that next quality level. (Specifically, I'd like to go from 720p60 to 1080p60 for sports, and it seems this is hard to do under H.264 as-is without making pretty big concessions in terms of artifacts/smudges, or else jacking up the bitrate so much that clients will start having viewing problems.)

After some brief testing, it seems SVT-AV1 is the obvious choice; if you've got the cores, it produces pretty good-looking 10-bit AV1 using less CPU time than x264 veryfast (!), possibly mostly due to better parallelization. But information about using it for live streaming was hard to find, and asking online turned up zero useful information. So I did some practical tests for live-specific issues, starting with rate control.

First of all, we need to identify which problem we want to solve. For a live stream, there are two good reasons to have good rate control:

  • Bandwidth costs money, both for ourselves and for the client.
  • The client should be able to watch the stream without buffering.

The former is about long-term averages, the latter is about short-term averages. Usually, we ignore the former and focus mostly on the latter (especially since solving the latter will keep the former mostly or completely in check).

My testing is empirical and mostly a spot-check; I don't have a large library of interesting high-quality video, nor do I have the patience to run through it. As sample clip, I chose the first 60 seconds (without audio) of cathodoluminescence by mfx and holon; it is a very challenging clip both encoding- and rate control-wise (it goes from all black to spinning and swooshing things with lots of noise on top, with huge complexity swings on the order of seconds), and I happened to have a high-quality 1080p60 recording that I could use as a master. We'll encode this to match a hypothetical 3 Mbit/sec viewer, to really give the encoder a run for its money. Most clips will be much easier than this, but there's always more to see in the hard cases than the easy ones.

First, let's check what happens without rate control; I encoded the clip using SVT-AV1 at preset 10, which is comfortably realtime on my 28-core Broadwell. (I would assume it's also good at my 16-core Zen 3, since it is much higher clocked, but I haven't checked.) I used constant quantizer, ie., there is no rate control at all; every frame, easy or hard, is encoded at the same quality. (I encoded the clip several times with different quantizers to find one that got me close to 3000 kbit/sec. Obviously, in a real-time scenario, we would have no such luxury.) With the addition of FFmpeg as the driver and some Perl to analyze it afterwards, this is what I got:

Flags: -c:v libsvtav1 -pix_fmt yuv420p10le -preset 10 -qp 54

Histogram of rates over 1-second blocks:

  250  ********
  750  *****************
 1250  ***********
 1750  ******
 2250  ***
 2750  ***
 3250  *
 3750  *
 4250  *****
 4750  
 5250  
----- ----- ----- ----- -----
 9250  *
13250  *
17750  **
35250  *

Min:    11 kbit/sec
Max: 35020 kbit/sec
Avg:  2914 kbit/sec


Primitive VBV with 3000 kbit max buffer (starting at 100% full), 3000 kbit/sec network:

Buffer minimum fill:        0 kbit
Time stalled:           25448 ms
Time with full buffer:  27157 ms

VMAF:                   57.39

Some explanations are in order here. What I've done is pretty simplistic; chop the resulting video into one-second blocks, and then measure how many bytes those are. You can see that even though the average bit rate is near our 3000 kbit/sec target, the majority of the time is actually spent around 500–1500 kbit/sec. But some seconds are huge outliers; up to 35 Mbit/sec.
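For the curious, that per-second measurement can be approximated without a dedicated script by letting ffprobe dump packet timestamps and sizes and aggregating them (a sketch of the idea, not the actual Perl; out.ivf is a placeholder filename):

# out.ivf is a placeholder for the encoded output file
ffprobe -v error -select_streams v:0 -show_entries packet=pts_time,size \
        -of csv=p=0 out.ivf |
  awk -F, '{ kbit[int($1)] += $2 * 8 / 1000 }
           END { for (s in kbit) printf "%4d  %6.0f kbit\n", s, kbit[s] }' |
  sort -n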

The next section is my toy VBV (video buffer verifier), which simulates a client downloading at a constant 3000 kbit/sec rate (as long as the buffer, set to one second, has room for it) and playing frames according to their timestamps. We can see that even though we're below the target bitrate, we spend a whopping 25 seconds buffering—for a 60 second clip! This is because most of the time, our buffer sits there comfortably full, which is blocking more downloads until we get to those problematic sections where the bitrate goes sky-high, and we fall behind really quickly. (Why not allow our buffer to go more-than-full, which would fix the problem? Well, first of all, this assumes the encoder has a huge delay so that it could actually feed data for those frames way ahead of play time, or they would simply not exist yet. Second, what about clients that joined in the middle of the stream?)

Note that my VBV script is not a standards-compliant verifier (e.g. it doesn't really take B-frames into account), so you'll need to take it with a grain of salt; still, it's a pretty good proxy for what's going on.

OK, so let's now test what happens with a known-good case; we encode with x264 and CBR settings matching our VBV:

Flags: -c:v libx264 -pix_fmt yuv420p10le -preset veryfast -x264-params "nal-hrd=cbr"
       -b:v 3M -minrate 3M -maxrate 3M -bufsize 3M

Histogram of rates over 1-second blocks:

  250  
  750  
 1250  
 1750  
 2250  *****
 2750  *******************
 3250  *******************************
 3750  *****

Min:  2032 kbit/sec
Max:  3968 kbit/sec
Avg:  2999 kbit/sec


Primitive VBV with 3000 kbit max buffer (starting at 100% full), 3000 kbit/sec network:

Buffer minimum fill:     1447 kbit
Time stalled:               0 ms
Time with full buffer:    128 ms

VMAF:                   50.29

This is spot-on. The global average is within 1 kbit/sec of what we asked for, each second is nicely clustered around our range, and we never stall. In fact, our buffer hardly goes past half-full. (Don't read too much into the VMAF numbers, as I didn't ask either codec to optimize for visual quality. Still, it's not unexpected that we get higher values for AV1, and that neither codec really manages good quality at these rates.)

Going back to AV1, we now move from constant quantizer to asking for a given bitrate. SVT-AV1 defaults to one-pass VBR, so we'll see what happens if we just give it a bitrate:

Flags: -c:v libsvtav1 -pix_fmt yuv420p10le -preset 10 -b:v 3M

Histogram of rates over 1-second blocks:

  250  **
  750  *******
 1250  **********
 1750  *****
 2250  **********
 2750  ************
 3250  ****
 3750  
 4250  *****
 4750  *
 5250  *
 5750  **
----- ----- ----- ----- -----
 7250  *

Min:    10 kbit/sec
Max:  7212 kbit/sec
Avg:  2434 kbit/sec


Primitive VBV with 3000 kbit max buffer (starting at 100% full), 3000 kbit/sec network:

Buffer minimum fill:        0 kbit
Time stalled:            3207 ms
Time with full buffer:  14639 ms

VMAF:                   61.81

It's not fantastic for streaming purposes (it's not designed for it either!), but it's much better than constant QP; the global average undershot a fair amount, and we still have some outliers causing stalls, but much less. Perhaps surprisingly, VMAF is significantly higher compared to constant QP (now roughly in “fair quality” territory), even though the overall rate is lower; the average frame is just much more important for quality. (Note that SVT-AV1 is not deterministic if you are using multithreading and rate control together, so if you run a second time, you could get different results.)

There is a “max bit rate” flag, too, but it seems not to do much for this clip (I don't even know if it's relevant for anything except capped CRF?), so I won't bore you with an identical set of data. Instead, let's try the CBR mode added in 1.0.0 (rc=2):

Svt[warn]: CBR Rate control is currently not supported for PRED_RANDOM_ACCESS, switching to VBR

Uh, OK. Switching to PRED_LOW_DELAY_B, then (pred-struct=1, helpfully undocumented):

Svt[warn]: Forced Low delay mode to use HierarchicalLevels = 3
Svt[warn]: Instance 1: The low delay encoding mode is a work-in-progress
project, and is only available for demos, experimentation, and further
development uses and should not be used for benchmarking until fully
implemented.
Svt[warn]: TPL is disabled in low delay applications.
Svt[info]: Number of logical cores available: 3

Ugh. So we're into experimental land, no TPL (SVT-AV1's variant of x264's mb-tree), and a maximum of three cores used. This means CBR is much slower; less than half the speed or so in these tests, and below the realtime threshold on this machine unless I reduce the preset. Still, let's see what it produces:

Flags: -c:v libsvtav1 -pix_fmt yuv420p10le -preset 10 -b:v 3M
       -svtav1-params pred-struct=1:rc=2

Histogram of rates over 1-second blocks:

  250  **
  750  *
 1250  *
 1750  ***
 2250  *********
 2750  ********************
 3250  ****************
 3750  *
 4250  ***
 4750  
 5250  
----- ----- ----- ----- -----
 6250  *
 6750  *
 7250  *
 7750  *

Min:    42 kbit/sec
Max:  7863 kbit/sec
Avg:  2998 kbit/sec


Primitive VBV with 3000 kbit max buffer (starting at 100% full), 3000 kbit/sec network:

Buffer minimum fill:        0 kbit
Time stalled:            4970 ms
Time with full buffer:   5522 ms

VMAF:                   61.53

This is not quite what we expected. The global average is now spot-on, but we are still bothered with outliers—and we're having more stalls than with the VBR mode (possibly because the lower bitrate overall helped a bit). Also note that the VMAF is no better, despite using more bitrate!

I believe these stalls point to a bug or shortcoming in SVT-AV1's CBR mode, so I've reported it, and we'll see what happens. But still, the limitations the low-delay prediction structure imposes on us (with associated quality loss) makes this a not terribly attractive option; it seems that this mode is a bit too new for serious use (perhaps not surprising, given the warnings it spits out).

So what is the best bet? I'd say that currently (as of git master, soon-to-be 1.2.0), it is using the default one-pass VBR mode (two-pass VBR obviously is a no-go for live streaming). Yes, it will fail VBV sometimes, but in practice, clients will usually have some headroom; again, we tune our bit rates lower than we'd need if buffering were our only constraint (to reduce people's bandwidth bills). It would be interesting to see how this pans out across a larger set of clips at some point; after all, most content isn't nearly as tricky as this.

There is still lots of exploration left to do; in particular, muxing the stream and getting it to actually play in browsers will be… fun? More to come, although I can't say exactly when.

24 July, 2022 02:40PM

July 23, 2022

Free Software Fellowship

List of Open Source suicides and accidents: volunteers and developers down

Volunteers are currently working to try and decode the Frans Pop Debian Day suicide. Here at the Fellowship we thought it would be helpful to look at Pop's case in the context of all the other suicides and accidental deaths across the entire open source ecosystem.

The Open Source mafia has been putting far too much pressure on volunteers in recent years. We decided to look at the cases of those who didn't survive.

We feel these cases demonstrate there are issues in the open source challenge to work/life balance and the systematic pushing of volunteers to work for free.

If you know any other cases or if you have more evidence about the cases listed already, please write to supporter@fsfellowship.news.

Frans Pop, Debian "Community", 2010 (suicide confirmed)

Mid-40s, single.

This is by far the most extraordinary case. Frans Pop sent multiple written emails about his grievances with Debian/Ubuntu culture. He sent a written resignation on the debian-private (leaked) gossip network the night before Debian Day. His suicide only took place four days later but it would appear he was contemplating it for Debian Day itself. His parents recovered a written suicide note. They sent an email to Steve McIntyre at Debian mentioning that Pop's main concern was his work for Debian.

See the Debian Suicide FYI for more details

Arjen Kamphuis, Infosec community (disappearance)

Aged 46

Kamphuis disappeared near Bodo, Norway on 20 August 2018. Coincidentally, it is the same day that Frans Pop committed suicide but eight years later. They are both from the same country, the Netherlands.

It is rumoured that Kamphuis was helping Wikileaks. Coincidentally, Kamphuis was born on Australia Day and Wikileaks was founded by Washington's most wanted Australian, the journalist Julian Assange.

Ian Murdock, Debian "Community", 2015 (suicide confirmed)

Aged 42, divorced, 3 children.

Murdock was the founder of Debian and a highly successful and well known engineer. It appears that he found himself alone in police custody during the Christmas season. This is a reflection of the impact that Debian may be having on the family life of volunteers.

Lucy Wayland, (formerly Jon Ward), Debian "Community", 2019 (coroner's report: accident)

Aged 46, single, no children, trans

Lucy Wayland had speculated about suicide in online postings.

The coroner's report tells us the death was an accident under the influence of alcohol. This type of alcoholism is considered to be a form of self-harm.

Lucy Wayland called for help, this is noted in the coroner's report.

Lucy's accident occurred during a period of intense negativity in Debian, the Debian Christmas lynchings of 2018.

Richard Rothwell, FSFE "Community", 2009 (suicide confirmed)

47 years old, married with two adult children.

Richard was a free software activist in the UK.

Software Coop published a blog.

There is an archived tribute page with many details and names of collaborators.

Stourbridge News published a report about his disappearance.

This appears to be an FSFE connected suicide. FSFE are the Germans who pretend to follow Richard Stallman while in fact over 60 percent of their funding comes from Google, Red Hat and Ubuntu. Georg Greve and Karsten Gerloff both commented on the tribute page. We found Richard's name in the FSFE-UK mailing list archives.

We found a longer history of his activities on this page.

He published a paper about his work with LTSP in schools.

Aaron Swartz, Creative Commons & others, 2013 (suicide confirmed)

26 years, single, no children.

Wikipedia page about Aaron Swartz.

Swartz had been persecuted/prosecuted by US federal prosecutors in relation to distribution of academic journals.

jd91mzm2 / LEGOlord208 (Samuel), 2021 (suspected suicide)

18 years, single

Article on Segment Fault mentions cyberbullying

He had created an enormous amount of code without being paid while still under 18

Near / Byuu / Dave, 2021 (confirmed suicide)

Developer of BSNES emulator, took an overdose, died in Japan.

Michael Anthony Bordlee, 2022 (suspected suicide)

Age 29, single

Artist, game designer at Parallax Visions, Museum of Virtual Art, open-source art.

Bordlee developed a VR simulator for depression after experiencing depression himself. He describes bullying taking a toll on him.

LinkedIn and an Obituary and tribute page where comments are invited.

Dr Alex Blewitt, Java, InfoQ, July 2022 (cause of death not disclosed)

InfoQ has published an obituary for Dr Blewitt: "It is with great sadness that we announce that InfoQ editor Dr. Alex Blewitt has unexpectedly passed away."

Dr Blewitt resided in Milton Keynes and at some point the Milton Keynes coroner may clarify the cause of death.

Thiemo Seufer, Debian "community", 2008 (accident)

Mid 20s, single.

Thiemo Seufer was working late on Debian on 25 December, Christmas Day, 2008. In the early hours of the 26 December he died in a car accident. We do not have any official documents about the nature of this accident or who was at fault.

People who work late are 300 percent more likely to have a traffic accident.

Adrian von Bidder, Debian "community", 2011 (heart attack)

Married

Adrian had a heart attack in Switzerland at the time when the local developers were beginning their DebConf13 bid.

Elias Diem, FSFE, 2016 (accident)

39, single

Elias was spending time with an FSFE work colleague, Roman Willi, hiking, on a Saturday. He deviated from the trail and fell or slipped (German link).

I have some bad news to share. On Saturday afternoon, Elias was on a hike in the Grisons mountains with our mutual friend Roman Willi. On the descent, over a short section of roughly 100-200 metres, Elias chose a different, shorter route than Roman. It was not a particularly dangerous spot: a steep grassy slope and some rocks. Elias fell on his way down. When Roman reached him, he found him unconscious but breathing. As there was no mobile phone reception, Roman had to run along the nearby trail to the next alp, which took him about a quarter of an hour. The Rega (air rescue) was alerted. Together with the farmer's son, the two of them sprinted back to the site of the accident. Shortly afterwards the Rega arrived as well. Unfortunately, the Rega doctor could only pronounce him dead. Elias had suffered a severe skull fracture and brain haemorrhaging. He was certainly unconscious immediately after the impact.

The more exact location was the descent from the Glegghorn to the Fläscheralp. Mountain specialists from the police were at the accident site today, Sunday. Their finding was that from above, where Elias started down, no particular danger was apparent. For an experienced hiker like Elias it was reasonable to choose this route.

Information about a farewell will follow in a few days.

Should I have forgotten to include someone on the email distribution list, please forward this mail to the person concerned.

Bernd-Juergen Brandes, 2001 (volunteered to be eaten by Armin Meiwes)

Age 43, unmarried

Brandes and Meiwes were both IT professionals in Germany. We can not verify if they were involved with open source or the FSFE.

Meiwes posted a notice on the Internet asking for volunteers to be eaten. It sounds a lot like starting a new open source project.

The open source similarities don't stop there, Brandes and Meiwes decided to video record the whole encounter, a lot like the notorious photo evidence of wrongdoing at DebConf19.

Meiwes is in prison and won't be free again any time soon.

The cannibalism video has been lost in police archives. They are seeking volunteers for a re-enactment

Richard Stallman (RMS), FSF (his enemies dream of him having an accident)

When the lynch mob attacked Stallman in 2019, people were quick to question whether he faces the risk of suicide.

First they came, Debian, FSFE, Open Source, Code of Conduct

23 July, 2022 10:00AM

July 16, 2022

hackergotchi for Thomas Goirand

Thomas Goirand

My work during debcamp

I arrived in Prizren late on Wednesday. Here’s what I did during debcamp (so over 3 days). I hope this post just motivates others to contribute more to Debian.

At least 2 DDs want to upload packages that need a new version of python3-jsonschema (i.e. version > 4.x). Unfortunately, version 4 broke a few packages. I therefore uploaded it to Experimental a few months/weeks ago, so I could see the autopkgtest results by reading the pseudo-excuses page, and it showed that a few packages broke. Here are the ones used by (or part of) OpenStack:

  • Nova
  • Designate
  • Ironic
  • python-warlock
  • Sahara
  • Vitrage

Thanks to a reactive upstream, I was able to fix the first 4 above, but not Sahara yet. Vitrage popped up when I uploaded Debian release 2 of jsonschema, surprisingly. Also, the python3-jsonschema autopkgtest itself was broken because python3-pip was missing from its dependencies, but that should be fixed as well.
I then filed bugs for packages not under my control:

  • bmtk
  • python-asdf

It looks like there's now also spyder, which wasn't in the list a few hours ago. Maybe I should also file a bug against it. At this point, I don't think the python-jsonschema transition is finished, but it's on the right track.
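
For anyone who wants a quick overview of what such a transition touches, the archive can be queried for reverse dependencies. The commands below are only a generic sketch using standard tools (apt-cache, and build-rdeps from devscripts), not necessarily the exact workflow I followed:

# runtime reverse dependencies of the new library
apt-cache rdepends python3-jsonschema

# source packages that build-depend on it (build-rdeps ships in devscripts)
build-rdeps python3-jsonschema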

Then I also uploaded a new package of Ceph, removing ceph-mgr-diskprediction-local because it depended on python3-sklearn, which the release team wanted to remove. I also prepared a point release update for it, but I'm currently waiting for the previous upload to migrate to testing before uploading the point release.

Last, I wrote the missing “update” command for extrepo, and pushed the merge request to Salsa. Now extrepo should be feature complete (at least from my point of view).

I also merged the patch for numberstation fixing the debian/copyright, and uploaded it to the NEW queue. It's a new package that does two-factor authentication, and it is mobile friendly: it works perfectly on any Mobian-powered phone.

Next, I intend to work with Arthur on the Cloud image finder. I hope we can find the time to work on it so it does what I need (i.e. support the kind of setup I want to do, with HA, puppet, etc.).

16 July, 2022 08:22PM by Goirand Thomas

July 20, 2022

hackergotchi for Norbert Preining

Norbert Preining

Enrico Zini on DAM and “responsability”

The one single person within Debian who has worked for years to get me ostracized and thrown out of Debian is … Enrico Zini. Probably because I made a joke about him and his ridiculous statement "Debian is a relationship between multiple people" (how trivial can you be to be printed on a huge poster?), and because I, not knowing that his buddy Martina Ferrari is trans, criticized them for spreading lies. Well … I should have known that doing this to a DAM (and back then also an Anti-Harassment-Team member) could land me in the "devil's kitchen".

Funny to see what kind of head-banging concoction of a talk Zini delivered to DebConf 2022. Obviously, no lesson learned, no reflection on their own failures to act properly. Always putting forth their private animosities over objective reasoning.

Another confirmation that Debian DAM (and CT) is as far from “data driven decision making” as …

Best greetings, one of your “troublesome people”

20 July, 2022 07:39AM by Norbert Preining

July 11, 2022

KDE/Plasma for Debian – Update 2022/7

(Update 2022-07-14: Plasma 5.25 now also available for Debian/stable)

Long time I haven’t posted a lot about KDE/Plasma for Debian, most people will know the reason. But I have anyway updated my repos at OBS, which now contain the latest releases of frameworks, gears, and plasma!

The status is as follows (all for Debian/stable, testing, and unstable):

  • Frameworks: 5.96
  • Gears: 22.04.3
  • Plasma 5.25: 5.25.3
  • Plasma 5.24 LTS: 5.24.6

Unfortunately, compiling Plasma 5.25 didn’t work out due to dependencies on newer versions of Xorg and related libraries.

I repeat (and update) the instructions for everyone here: First of all, you need to add my OBS key, say to /etc/apt/trusted.gpg.d/obs-npreining.asc, and add a file /etc/apt/sources.list.d/obs-npreining-kde.list containing the following lines, replacing the DISTRIBUTION part with one of Debian_11 (for Bullseye), Debian_Testing, or Debian_Unstable:

deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/other-deps/DISTRIBUTION/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/frameworks/DISTRIBUTION/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/plasma525/DISTRIBUTION/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/apps2204/DISTRIBUTION/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/other/DISTRIBUTION/ ./
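
As a concrete example, and assuming the OBS key has already been saved to /etc/apt/trusted.gpg.d/obs-npreining.asc as described above, the whole sources file for Debian unstable can be generated in one go (only a sketch; adjust DIST for Bullseye or Testing):

DIST=Debian_Unstable
for repo in other-deps frameworks plasma525 apps2204 other; do
  echo "deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/${repo}/${DIST}/ ./"
done | sudo tee /etc/apt/sources.list.d/obs-npreining-kde.list
sudo apt update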

Some programs in the other group have been recompiled against the Gears 22.04 libraries.

Enjoy!

Usual disclaimer: (1) Considering that I don't have a user-facing Debian computer anymore, all these packages are only tested by third parties and not by myself. Be aware! (2) Funny to read the Debian Social Contract, point 4: "Our priorities are our users and free software". Obviously I care a lot about my users, more than some other Debian members do.

11 July, 2022 02:51AM by Norbert Preining

July 20, 2022

Enrico Zini

Deconstruction of the DAM hat

Further reading

Talk notes

Intro

  • I'm not speaking for the whole of DAM
  • Motivation is in part personal frustration, and the need to set boundaries and negotiate expectations

Debian Account Managers

  • history

Responsibility for official membership

  • approve account creation
  • manage the New Member Process and nm.debian.org
  • close MIA accounts
  • occasional emergency termination of accounts
  • handle Emeritus
  • with lots of help from FrontDesk and MIA teams (big shoutout)

What DAM is not

  • we are not mediators
  • we are not a community management team
  • a list or IRC moderation team
  • we are not responsible for vision or strategic choices about how people are expected to interact in Debian
  • We shouldn't try and solve things because they need solving

Unexpected responsibilities

  • Over time, the community has grown larger and more complex, in a larger and more complex online environment
  • Enforcing the Diversity Statement and the Code of Conduct
  • Emergency list moderation
    • we have ended up using DAM warnings to compensate for the lack of list moderation, at least twice
  • contributors.debian.org (mostly only because of me, but it would be good to have its own team)

DAM warnings

  • except for rare glaring cases, patterns of behaviour / intentions / taking feedback in, are more relevant than individual incidents
  • we do not set out to fix people. It is enough for us to get people to acknowledge a problem
    • if they can't acknowledge a problem they're probably out
    • once a problem is acknowledged, fixing it could be their implementation detail
    • then again it's not that easy to get a number of troublesome people to acknowledge problems, so we go back to the problem of deciding when enough is enough

DAM warnings?

  • I got to a point where I look at DAM warnings as potential signals that DAM has ended up with the ball that everyone else in Debian dropped.
  • DAM warning means we haven't gotten to a last resort situation yet, meaning that it probably shouldn't be DAM dealing with this at this point
  • Everyone in the project can write to a person "do you realise there's an issue here? Can you do something to stop?", and give them a chance to reflect on issues or ignore them, and build their reputation accordingly.
  • People in Debian should not have to endure, completely powerless, as trolls drag painful list discussions on indefinitely until all the trolled people run out of energy and leave. At the same time, people who abuse a list should expect to be suspended or banned from the list, not have their Debian membership put into question (unless it is a recurring pattern of behaviour).
  • The push to grow DAM warnings as a tool, is a sign of the rest of Debian passing on their responsibilities, and DAM picking them up.
  • Then in DAM we end up passing on things, too, because we also don't have the energy to face another intensive megametathread, and as we take actions for things that shouldn't quite be our responsibility, we face a higher level of controversy, and therefore demotivation.
  • Also, as we take actions for things that shouldn't be our responsibility, and work on a higher level of controversy, our legitimacy is undermined (and understandably so)
    • there's a pothole on my street that never gets filled, so at some point I go out and fill it. Then people thank me, people complain I shouldn't have, people complain I didn't fill it right, people appreciate the gesture and invite me to learn how to fix potholes better, people point me out to more potholes, and then complain that potholes don't get fixed properly on the whole street. I end up being the problem, instead of whoever had responsibility of the potholes but wasn't fixing them
  • The Community Team, the Diversity Team, and individual developers, have no energy or entitlement for explaining what a healthy community looks like, and DAM is left with that responsibility in the form of accountability for their actions: to issue, say, a DAM warning for bullying, we are expected to explain what is bullying, and how that kind of behaviour constitutes bullying, in a way that is understandable by the whole project.
  • Since there isn't consensus in the project about what bullying looks like, we end up having to define it in a warning, which again is a responsibility we shouldn't have, and we need to do it because we have an escalated situation at hand, but we can't do it right

House rules

Interpreting house rules

  • you can't encode common sense about people's behaviour in written rules: no matter how hard you try, people will find ways to cheat that
  • so one can use rules as a guideline, with someone responsible for the bits that can't go into rules.
    • context matters, privilege/oppression matters, patterns matter, history matters
  • example:
    • call a person out for breaking a rule
    • get DARVO in response
    • state that DARVO is not acceptable
    • get concern trolling against marginalised people and accuse them of DARVO if they complain
  • example: assume good intentions vs enabling
  • example: rule lawyering and Figure skating
  • this cannot be solved by GRs: I/we (DAM)/possibly also we (Debian) don't want to do GRs about evaluating people

Governance by bullying

  • How to DoS discussions in Debian
    • example: gender, minority groups, affirmative action, inclusion, anything about the community team itself, anything about the CoC, systemd, usrmerge, dam warnings, expulsions
      • think of a topic. Think about sending a mail to debian-project about it. If you instinctively shiver at the thought, this is probably happening
      • would you send a mail about that to -project / -devel?
      • can you think of other topics?
    • it is an effective way of governance as it excludes topics from public discussion
  • A small number of people abuse all this, intentionally or not, to effectively manipulate decision making in the project.
  • Instead of using the rules of the community to bring forth the issues one cares about, it costs less energy to make it unthinkable or unbearable to have a discussion on issues one doesn't want to progress. What one can't stop constructively, one can oppose destructively.
  • even regularly diverting the discussion away from the original point or concern is enough to derail it without people realising you're doing it
  • This is an effective strategy for a few reckless people to unilaterally direct change, in the current state of Debian, at the cost of the health and the future of the community as a whole.
  • There are now a number of important issues nobody has the energy to discuss, because experience says that energy requirements to bring them to the foreground and deal with the consequences are anticipated to be disproportionate.
  • This is grave, as we're talking about trolling and bullying as malicious power moves to work around the accepted decision making structures of our community.
  • Solving this is out of scope for this talk, but it is urgent nevertheless, and can't be solved by expecting DAM to fix it

How about the Community Team?

  • It is also a small group of people who cannot pick up the responsibility of doing what the community isn't doing for itself
  • I believe we need to recover the Community Team: it's been years that every time they write something in public, they get bullied by the same recurring small group of people (see governance by bullying above)

How about DAM?

  • I was just saying that we are not the emergency catch all
  • When the only enforcement you have is "nuclear escalation", there's nothing you can do until it's too late, and meanwhile lots of people suffer (this was written before Russia invaded Ukraine)
  • Also, when issues happen on public lists, the BTS, or on IRC, some of the perpetrators are also outside of the jurisdiction of DAM, which shows how DAM is not the tool for this

How about the DPL?

  • Talking about emergency catch alls, don't they have enough to do already?

Concentrating responsibility

  • Concentrating all responsibility on social issues on a single point creates a scapegoat: we're blamed for any conduct issue, and we're blamed for any action we take on conduct issues
    • also, when you are a small group you are personally identified with it. Taking action on a person may mean making a new enemy, and becoming a target for harassment, retaliation, or even just the general unwarranted hostility of someone who is left with an axe to grind
  • As long as responsibility is centralised, any action one takes as a response of one micro-aggression (or one micro-aggression too many) is an overreaction. Distributing that responsibility allows a finer granularity of actions to be taken
    • you don't call the police to tell someone they're being annoying at the pub: the people at the pub will tell you you're being annoying, and the police are called if you want to beat them up in response
  • We are also a community where we have no tool to give feedback to posts, so it still looks good to nitpick stupid details with smart-looking, cutting one-liners, or elaborate confrontational put-downs, and one doesn't get the feedback of "that did not help". Compare with discussing https://salsa.debian.org/debian/grow-your-ideas/ which does have this kind of feedback
    • the lack of moderation and enforcement makes the Debian community ideal for easy baiting, concern trolling, dog whistling, and related fun, and people who are not empowered can thus be manipulated into trolling those responsible
    • if you're fragile in Debian, people will play cat and mouse with you. It might be social awkwardness, or people taking themselves too seriously, but it can easily become bullying, and with no feedback it's hard to tell and course correct
  • Since DAM and DPL are where the ball stops, everyone else in Debian can afford to let the ball drop.
  • More generally, if only one group is responsible, nobody else is

Empowering developers

  • Police alone does not make a community safe: a community makes a community safe.
  • DDs currently have no power to act besides complaining to DAM, or complaining to the Community Team, which then can only pass complaints on to DAM.
    • you could act directly, but currently nobody has your back if the (micro-)aggression then starts extending to you, too
  • From no power comes no responsibility. And yet, the safety of a community is sustainable only if it is the responsibility of every member of the community.
  • don't wait for DAM as the only group who can do something
  • people should be able to address issues in smaller groups, without escalation at project level
  • but people don't have the tools for that
  • I/we've shouldered this responsibility for far too long because nobody else was doing it, and it's time the whole Debian community gets its act together and picks up this responsibility as it should. You don't get to not care just because there's a small number of people who are caring for you.

What needs to happen

  • distinguish DAM decisions from decisions that are more about vision and direction, and would require more representation
  • DAM warnings shouldn't belong in DAM
  • who is responsible for interpretation of the CoC?
  • deciding what to do about controversial people shouldn't belong in DAM
  • curation of the community shouldn't belong in DAM
  • can't do this via GRs, it's a mess to do a GR to decide how acceptable a specific person's behaviour is, and a lot of this requires more and more frequent micro-decisions than one'd do via GRs

20 July, 2022 05:55AM

July 15, 2022

hackergotchi for Steve Kemp

Steve Kemp

So we come to Lisp

Recently I've been working with simple/trivial scripting languages, and I guess I finally reached a point where I thought "Lisp? Why not". One of the reasons for recent experimentation was thinking about the kind of minimalism that makes implementing a language less work - being able to actually use the language to write itself.

FORTH is my recurring example, because implementing it mostly means writing a virtual machine which consists of memory ("cells") along with a pair of stacks, and some primitives for operating upon them. Once you have that groundwork in place you can layer the higher-level constructs (such as "for", "if", etc).

Lisp allows a similar approach, albeit with slightly fewer low-level details required, and far less tortuous thinking. Lisp always feels higher-level to me anyway, given the explicit data-types ("list", "string", "number", etc).

Here's something that works in my toy lisp:

;; Define a function, `fact`, to calculate factorials (recursively).
(define fact (lambda (n)
  (if (<= n 1)
      1
      (* n (fact (- n 1))))))

;; Invoke the factorial function, using apply
(apply (list 1 2 3 4 5 6 7 8 9 10)
  (lambda (x)
    (print "%s! => %s" x (fact x))))

The core language doesn't have helpful functions to filter lists, or build up lists by applying a specified function to each member of a list, but adding them is trivial using the standard car, cdr, and simple recursion. That means you end up writing lots of small functions like this:

(define zero? (lambda (n) (if (= n 0) #t #f)))
(define even? (lambda (n) (if (zero? (% n 2)) #t #f)))
(define odd?  (lambda (n) (! (even? n))))
(define sq    (lambda (x) (* x x)))

Once you have them you can use them in a way that feels simple and natural:

(print "Even numbers from 0-10: %s"
  (filter (nat 11) (lambda (x) (even? x))))

(print "Squared numbers from 0-10: %s"
  (map (nat 11) (lambda (x) (sq x))))

This all feels very sexy and simple, because the implementations of map, apply, filter are all written using the lisp - and they're easy to write.

Lisp takes things further than some other "basic" languages because of the (infamous) support for Macros. But even without them, writing new useful functions is pretty simple. Where do things struggle? I guess I don't actually have a history of using lisp to solve real problems - although it's great for configuring my editor...

Anyway I guess the journey continues. Having looked at the obvious "minimal core" languages I need to go further afield:

I'll make an attempt to look at some of the esoteric programming languages, and see if any of those are fun to experiment with.

15 July, 2022 01:01AM

July 18, 2022

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

Unsubscribe

Remove me, kthxbye

18 July, 2022 05:01AM

July 20, 2022

Antoine Beaupré

Relaying mail through debian.org

Back in 2020, I wrote this article about using DKIM to sign outgoing debian.org mail. This worked well for me for a while: outgoing mail was signed with DKIM and somehow was delivered. Maybe. Who knows.

But now we have a relay server, which makes this kind of moot. So I have changed my configuration to use that relay instead of sending email on my own. Mail now comes from a real debian.org machine, which seems more reliable, and I'm hoping this will have a better reputation than my current setup.

In general, you should follow the DSA documentation which includes a Postfix configuration. In my case, it was basically this patch:

diff --git a/postfix/main.cf b/postfix/main.cf
index 7fe6dd9e..eabe714a 100644
--- a/postfix/main.cf
+++ b/postfix/main.cf
@@ -55,3 +55,4 @@ smtp_sasl_security_options =
 smtp_sender_dependent_authentication = yes
 sender_dependent_relayhost_maps = hash:/etc/postfix/sender_relay
 sender_dependent_default_transport_maps = hash:/etc/postfix/sender_transport
+smtp_tls_policy_maps = hash:/etc/postfix/tls_policy
diff --git a/postfix/sender_relay b/postfix/sender_relay
index b486d687..997cce19 100644
--- /dev/null
+++ b/postfix/sender_relay
@@ -0,0 +1,2 @@
+# Per-sender provider; see also /etc/postfix/sasl_passwd.
+@debian.org    [mail-submit.debian.org]:submission
diff --git a/postfix/sender_transport b/postfix/sender_transport
index ca69bc7a..c506c1fc 100644
--- /dev/null
+++ b/postfix/sender_transport
@@ -0,0 +1,1 @@
+anarcat@debian.org     smtp:
diff --git a/postfix/tls_policy b/postfix/tls_policy
new file mode 100644
index 00000000..9347921a
--- /dev/null
+++ b/postfix/tls_policy
@@ -0,0 +1,1 @@
+submission.torproject.org:submission   verify ciphers=high

This configuration differs from the one provided by DSA because I already had the following configured:

sender_dependent_relayhost_maps = hash:/etc/postfix/sender_relay
smtp_sender_dependent_authentication = yes
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_tls_security_options = noanonymous

I also don't show the patch on /etc/postfix/sasl_passwd for obvious security reasons.

I also had to setup a tls_policy map, because I couldn't use dane for all my remotes. You'll notice I also had to setup a sender_transport because I use a non-default default_transport as well.
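
One piece of standard Postfix housekeeping worth spelling out (nothing specific to this setup): the hash: tables referenced above need to be compiled with postmap, and Postfix reloaded, before any of this takes effect:

postmap /etc/postfix/sender_relay /etc/postfix/sender_transport /etc/postfix/tls_policy /etc/postfix/sasl_passwd
postfix reload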

It also seems like you can keep the previous DKIM configuration in parallel with this one, as long as you don't double-sign outgoing mail. Since this configuration here is done on my mail client (i.e. not on the server where I am running OpenDKIM), I'm not double-signing so I left the DKIM configuration alone. But if I wanted to remove it, the magic command is:

echo "del dkimPubKey" | gpg --clearsign | mail changes@db.debian.org

20 July, 2022 05:22PM

November 05, 2012

hackergotchi for Alberto García

Alberto García

Igalia at LinuxCon Europe

I came to Barcelona with a few other Igalians this week for LinuxCon, the Embedded
Linux Conference
and the KVM Forum.

We are sponsoring the event and we have a couple of presentations this year, one about QEMU, device drivers and industrial hardware (which I gave today, slides here) and the other about the Grilo multimedia framework (by Juan Suárez).

We’ll be around the whole week so you can come and talk to us anytime. You can find us at our booth on the ground floor, where you’ll also be able to see a few demos of our latest work and get some merchandising.

Igalia booth

05 November, 2012 10:08PM by berto

July 05, 2022

Running the Steam Deck’s OS in a virtual machine using QEMU

SteamOS desktop

Introduction

The Steam Deck is a handheld gaming computer that runs a Linux-based operating system called SteamOS. The machine comes with SteamOS 3 (code name “holo”), which is in turn based on Arch Linux.

Although there is no SteamOS 3 installer for a generic PC (yet), it is very easy to install on a virtual machine using QEMU. This post explains how to do it.

The goal of this VM is not to play games (you can already install Steam on your computer after all) but to use SteamOS in desktop mode. The Gamescope mode (the console-like interface you normally see when you use the machine) requires additional development to make it work with QEMU and will not work with these instructions.

A SteamOS VM can be useful for debugging, development, and generally playing and tinkering with the OS without risking breaking the Steam Deck.

Running the SteamOS desktop in a virtual machine only requires QEMU and the OVMF UEFI firmware and should work in any relatively recent distribution. In this post I'm using QEMU directly, but you can also use virt-manager or some other tool if you prefer; we're emulating a standard x86_64 machine here.

General concepts

SteamOS is a single-user operating system and it uses an A/B partition scheme, which means that there are two sets of partitions and two copies of the operating system. The root filesystem is read-only and system updates happen on the partition set that is not active. This allows for safer updates, among other things.

There is one single /home partition, shared by both partition sets. It contains the games, user files, and anything that the user wants to install there.

Although the user can trivially become root, make the root filesystem read-write and install or change anything (the pacman package manager is available), this is not recommended because

  • it increases the chances of breaking the OS, and
  • any changes will disappear with the next OS update.

A simple way for the user to install additional software that survives OS updates and doesn’t touch the root filesystem is Flatpak. It comes preinstalled with the OS and is integrated with the KDE Discover app.
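
For example, installing an application from a terminal could look like this (the Flathub remote and the application ID are illustrative assumptions; the KDE Discover app does the same thing graphically, and the --user flag keeps everything in the home directory):

$ flatpak remote-add --user --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
$ flatpak install --user -y flathub org.videolan.VLC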

Preparing all necessary files

The first thing that we need is the installer. For that we have to download the Steam Deck recovery image from here: https://store.steampowered.com/steamos/download/?ver=steamdeck&snr=

Once the file has been downloaded, we can uncompress it and we’ll get a raw disk image called steamdeck-recovery-4.img (the number may vary).

Note that the recovery image is already SteamOS (just not the most up-to-date version). If you simply want to have a quick look you can play a bit with it and skip the installation step. In this case I recommend that you extend the image before using it, for example with ‘truncate -s 64G steamdeck-recovery-4.img‘ or, better, create a qcow2 overlay file and leave the original raw image unmodified: ‘qemu-img create -f qcow2 -F raw -b steamdeck-recovery-4.img steamdeck-recovery-extended.qcow2 64G‘.

But here we want to perform the actual installation, so we need a destination image. Let’s create one:

$ qemu-img create -f qcow2 steamos.qcow2 64G

Installing SteamOS

Now that we have all files we can start the virtual machine:

$ qemu-system-x86_64 -enable-kvm -smp cores=4 -m 8G \
    -device usb-ehci -device usb-tablet \
    -device intel-hda -device hda-duplex \
    -device VGA,xres=1280,yres=800 \
    -drive if=pflash,format=raw,readonly=on,file=/usr/share/ovmf/OVMF.fd \
    -drive if=virtio,file=steamdeck-recovery-4.img,driver=raw \
    -device nvme,drive=drive0,serial=badbeef \
    -drive if=none,id=drive0,file=steamos.qcow2

Note that we’re emulating an NVMe drive for steamos.qcow2 because that’s what the installer script expects. This is not strictly necessary but it makes things a bit easier. If you don’t want to do that you’ll have to edit ~/tools/repair_device.sh and change DISK and DISK_SUFFIX.

SteamOS installer shortcuts

Once the system has booted we’ll see a KDE Plasma session with a few tools on the desktop. If we select “Reimage Steam Deck” and click “Proceed” on the confirmation dialog then SteamOS will be installed on the destination drive. This process should not take a long time.

Now, once the operation finishes a new confirmation dialog will ask if we want to reboot the Steam Deck, but here we have to choose “Cancel”. We cannot use the new image yet because it would try to boot into the Gamescope session, which won’t work, so we need to change the default desktop session.

SteamOS comes with a helper script that allows us to enter a chroot after automatically mounting all SteamOS partitions, so let’s open a Konsole and make the Plasma session the default one in both partition sets:

$ sudo steamos-chroot --disk /dev/nvme0n1 --partset A
# steamos-readonly disable
# echo '[Autologin]' > /etc/sddm.conf.d/zz-steamos-autologin.conf
# echo 'Session=plasma.desktop' >> /etc/sddm.conf.d/zz-steamos-autologin.conf
# steamos-readonly enable
# exit

$ sudo steamos-chroot --disk /dev/nvme0n1 --partset B
# steamos-readonly disable
# echo '[Autologin]' > /etc/sddm.conf.d/zz-steamos-autologin.conf
# echo 'Session=plasma.desktop' >> /etc/sddm.conf.d/zz-steamos-autologin.conf
# steamos-readonly enable
# exit

After this we can shut down the virtual machine. Our new SteamOS drive is ready to be used. We can discard the recovery image now if we want.

Booting SteamOS and first steps

To boot SteamOS we can use a QEMU line similar to the one used during the installation. This time we’re not emulating an NVMe drive because it’s no longer necessary.

$ cp /usr/share/OVMF/OVMF_VARS.fd .
$ qemu-system-x86_64 -enable-kvm -smp cores=4 -m 8G \
   -device usb-ehci -device usb-tablet \
   -device intel-hda -device hda-duplex \
   -device VGA,xres=1280,yres=800 \
   -drive if=pflash,format=raw,readonly=on,file=/usr/share/ovmf/OVMF.fd \
   -drive if=pflash,format=raw,file=OVMF_VARS.fd \
   -drive if=virtio,file=steamos.qcow2 \
   -device virtio-net-pci,netdev=net0 \
   -netdev user,id=net0,hostfwd=tcp::2222-:22

(the last two lines redirect tcp port 2222 to port 22 of the guest to be able to SSH into the VM. If you don’t want to do that you can omit them)

If everything went fine, you should see KDE Plasma again, this time with a desktop icon to launch Steam and another one to “Return to Gaming Mode” (which we should not use because it won’t work). See the screenshot that opens this post.

Congratulations, you’re running SteamOS now. Here are some things that you probably want to do:

  • (optional) Change the keyboard layout in the system settings (the default one is US English)
  • Set the password for the deck user: run ‘passwd‘ on a terminal
  • Enable / start the SSH server: ‘sudo systemctl enable sshd‘ and/or ‘sudo systemctl start sshd‘.
  • SSH into the machine: ‘ssh -p 2222 deck@localhost‘

Updating the OS to the latest version

The Steam Deck recovery image doesn’t install the most recent version of SteamOS, so now we should probably do a software update.

  • First of all ensure that you're giving enough RAM to the VM (in my examples I run QEMU with -m 8G). The OS update might fail if you use less.
  • (optional) Change the OS branch if you want to try the beta release: ‘sudo steamos-select-branch beta‘ (or main, if you want the bleeding edge)
  • Check the currently installed version in /etc/os-release (see the BUILD_ID variable)
  • Check the available version: ‘steamos-update check‘
  • Download and install the software update: ‘steamos-update‘

Note: if the last step fails after reaching 100% with a post-install handler error then go to Connections in the system settings, rename Wired Connection 1 to something else (anything, the name doesn’t matter), click Apply and run steamos-update again. This works around a bug in the update process. Recent images fix this and this workaround is not necessary with them.

As we did with the recovery image, before rebooting we should ensure that the new update boots into the Plasma session, otherwise it won’t work:

$ sudo steamos-chroot --partset other
# steamos-readonly disable
# echo '[Autologin]' > /etc/sddm.conf.d/zz-steamos-autologin.conf
# echo 'Session=plasma.desktop' >> /etc/sddm.conf.d/zz-steamos-autologin.conf
# steamos-readonly enable
# exit

After this we can restart the system.

If everything went fine we should be running the latest SteamOS release. Enjoy!

Reporting bugs

SteamOS is under active development. If you find problems or want to request improvements please go to the SteamOS community tracker.

Edit 06 Jul 2022: Small fixes, mention how to install the OS without using NVMe.

05 July, 2022 07:11PM by berto

December 03, 2020

Subcluster allocation for qcow2 images

In previous blog posts I talked about QEMU’s qcow2 file format and how to make it faster. This post gives an overview of how the data is structured inside the image and how that affects performance, and this presentation at KVM Forum 2017 goes further into the topic.

This time I will talk about a new extension to the qcow2 format that seeks to improve its performance and reduce its memory requirements.

Let’s start by describing the problem.

Limitations of qcow2

One of the most important parameters when creating a new qcow2 image is the cluster size. Much like a filesystem’s block size, the qcow2 cluster size indicates the minimum unit of allocation. One difference however is that while filesystems tend to use small blocks (4 KB is a common size in ext4, ntfs or hfs+) the standard qcow2 cluster size is 64 KB. This adds some overhead because QEMU always needs to write complete clusters so it often ends up doing copy-on-write and writing to the qcow2 image more data than what the virtual machine requested. This gets worse if the image has a backing file because then QEMU needs to copy data from there, so a write request not only becomes larger but it also involves additional read requests from the backing file(s).

Because of that qcow2 images with larger cluster sizes tend to:

  • grow faster, wasting more disk space and duplicating data.
  • increase the amount of necessary I/O during cluster allocation,
    reducing the allocation performance.

Unfortunately, reducing the cluster size is in general not an option because it also has an impact on the amount of metadata used internally by qcow2 (reference counts, guest-to-host cluster mapping). Decreasing the cluster size increases the number of clusters and the amount of necessary metadata. This has a direct negative impact on I/O performance, which can be mitigated by caching it in RAM, therefore increasing the memory requirements (the aforementioned post covers this in more detail).
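
To give a rough idea of the scale (assuming the standard 8-byte L2 entries), this is what the L2 metadata of a fully allocated 1 TB image looks like for two different cluster sizes:

64 KB clusters:  ~16 million L2 entries  x 8 bytes ≈ 128 MB of L2 tables
 4 KB clusters: ~256 million L2 entries  x 8 bytes ≈   2 GB of L2 tables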

Subcluster allocation

The problems described in the previous section are well-known consequences of the design of the qcow2 format and they have been discussed over the years.

I have been working on a way to improve the situation and the work is now finished and available in QEMU 5.2 as a new extension to the qcow2 format called extended L2 entries.

The so-called L2 tables are used to map guest addresses to data clusters. With extended L2 entries we can store more information about the status of each data cluster, and this allows us to have allocation at the subcluster level.

The basic idea is that data clusters are now divided into 32 subclusters of the same size, and each one of them can be allocated separately. This allows combining the benefits of larger cluster sizes (less metadata and RAM requirements) with the benefits of smaller units of allocation (less copy-on-write, smaller images). If the subcluster size matches the block size of the filesystem used inside the virtual machine then we can eliminate the need for copy-on-write entirely.

So with subcluster allocation we get:

  • Sixteen times less metadata per unit of allocation, greatly reducing the amount of necessary L2 cache.
  • Much faster I/O during allocation when the image has a backing file, up to 10-15 times more I/O operations per second for the same cluster size in my tests (see chart below).
  • Smaller images and less duplication of data.

This figure shows the average number of I/O operations per second that I get with 4KB random write requests to an empty 40GB image with a fully populated backing file.

I/O performance comparison between traditional and extended qcow2 images

Things to take into account:

  • The performance improvements described earlier happen during allocation. Writing to already allocated (sub)clusters won’t be any faster.
  • If the image does not have a backing file chances are that the allocation performance is equally fast, with or without extended L2 entries. This depends on the filesystem, so it should be tested before enabling this feature (but note that the other benefits mentioned above still apply).
  • Images with extended L2 entries are sparse, that is, they have holes and because of that their apparent size will be larger than the actual disk usage.
  • It is not recommended to enable this feature in compressed images, as compressed clusters cannot take advantage of any of the benefits.
  • Images with extended L2 entries cannot be read with older versions of QEMU.

How to use this?

Extended L2 entries are available starting from QEMU 5.2. Due to the nature of the changes it is unlikely that this feature will be backported to an earlier version of QEMU.

In order to test this you simply need to create an image with extended_l2=on, and you also probably want to use a larger cluster size (the default is 64 KB, remember that every cluster has 32 subclusters). Here is an example:

$ qemu-img create -f qcow2 -o extended_l2=on,cluster_size=128k img.qcow2 1T

And that’s all you need to do. Once the image is created all allocations will happen at the subcluster level.

More information

This work was presented at the 2020 edition of the KVM Forum. Here is the video recording of the presentation, where I cover all this in more detail:

You can also find the slides here.

Acknowledgments

This work has been possible thanks to Outscale, who have been sponsoring Igalia and my work in QEMU.

Igalia and Outscale

And thanks of course to the rest of the QEMU development team for their feedback and help with this!

03 December, 2020 06:15PM by berto

November 16, 2017

“Improving the performance of the qcow2 format” at KVM Forum 2017

I was in Prague last month for the 2017 edition of the KVM Forum. There I gave a talk about some of the work that I’ve been doing this year to improve the qcow2 file format used by QEMU for storing disk images. The focus of my work is to make qcow2 faster and to reduce its memory requirements.

The video of the talk is now available and you can get the slides here.

The KVM Forum was co-located with the Open Source Summit and the Embedded Linux Conference Europe. Igalia was sponsoring both events one more year and I was also there together with some of my colleagues. Juanjo Sánchez gave a talk about WPE, the WebKit port for embedded platforms that we released.

The video of his talk is also available.

16 November, 2017 10:16AM by berto

August 26, 2019

The status of WebKitGTK in Debian

Like all other major browser engines, WebKit is a project that evolves very fast with releases every few weeks containing new features and security fixes.

WebKitGTK is available in Debian under the webkit2gtk name, and we are doing our best to provide the most up-to-date packages for as many users as possible.

I would like to give a quick summary of the status of WebKitGTK in Debian: what you can expect and where you can find the packages.

  • Debian unstable (sid): The most recent stable version of WebKitGTK (2.24.3 at the time of writing) is always available in Debian unstable, typically on the same day of the upstream release.
  • Debian testing (bullseye): If no new bugs are found, that same version will be available in Debian testing a few days later.
  • Debian stable (buster): WebKitGTK is covered by security support for the first time in Debian buster, so stable releases that contain security fixes will be made available through debian-security. The upstream dependencies policy guarantees that this will be possible during the buster lifetime. Apart from security updates, users of Debian buster will get newer packages during point releases.
  • Debian experimental: The most recent development version of WebKitGTK (2.25.4 at the time of writing) is always available in Debian experimental.

In addition to that, the most recent stable versions are also available as backports.

  • Debian stable (buster): Users can get the most recent stable releases of WebKitGTK from buster-backports, usually a couple of days after they are available in Debian testing (see the example right after this list).
  • Debian oldstable (stretch): While possible, we are also providing backports for stretch using stretch-backports-sloppy. Due to older or missing dependencies some features may be disabled when compared to the packages in buster or testing.
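
For example, on a buster system with buster-backports enabled, upgrading the engine is a single command (the binary package name corresponds to the 2.24.x series and is given purely as an illustration):

$ sudo apt install -t buster-backports libwebkit2gtk-4.0-37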

You can also find a table with an overview of all available packages here.

One last thing: as explained on the release notes, users of i386 CPUs without SSE2 support will have problems with the packages available in Debian buster (webkit2gtk 2.24.2-1). This problem has already been corrected in the packages available in buster-backports or in the upcoming point release.

26 August, 2019 01:13PM by berto

June 21, 2022

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

Montreal's Debian & Stuff - June 2022

As planned, we held our second local Debian meeting of the year last Sunday. We met at the lovely Eastern Bloc (an artists' hacklab) to work on Debian (and other stuff!), chat and socialise.

Although there were fewer people than at our last meeting1, we still did a lot of work!

I worked on fixing a bunch of bugs in Clojure packages2, LeLutin worked on podman and packaged libinfluxdb-http-perl and anarcat worked on internetarchive, trocla and moneta. Olivier also came by and worked on debugging his Kali install.

We are planning to have our next meeting at the end of August. If you are interested, the best way to stay in touch is either to subscribe to our mailing list or to join our IRC channel (#debian-quebec on OFTC). Events are also posted on Quebec's Agenda du libre.

Many thanks to Debian for providing us a budget to rent the venue for the day and for the pizza! Here is a nice picture anarcat took of (one of) the glasses of porter we had afterwards, at the next door brewery:

A glass of English Porter from Silo Brewery


  1. Summer meetings are always less populous and it also happened to be Father's Day... 

  2. #1012824, #1011856, #1011837, #1011844, #1011864 and #1011967

21 June, 2022 05:37PM by Louis-Philippe Véronneau

June 25, 2022

Ryan Kavanagh

Routable network addresses with OpenIKED and systemd-networkd

I’ve been using OpenIKED for some time now to configure my VPN. One of its features is that it can dynamically assign addresses on the internal network to clients, and clients can assign these addresses and routes to interfaces. However, these interfaces must exist before iked can start. Some months ago I switched my Debian laptop’s configuration from the traditional ifupdown to systemd-networkd. It took me some time to figure out how to have systemd-networkd create dummy interfaces on which iked can install addresses, but also not interfere with iked by trying to manage these interfaces. Here is my working configuration.

First, I have systemd create the interface dummy1 by creating a systemd.netdev(5) configuration file at /etc/systemd/network/20-dummy1.netdev:

[NetDev]
Name=dummy1
Kind=dummy 

Then I tell systemd not to manage this interface by creating a systemd.network(5) configuration file at /etc/systemd/network/20-dummy1.network:

[Match]
Name=dummy1
Unmanaged=yes

Restarting systemd-networkd causes these interfaces to get created, and we can then check their status using networkctl(8):

$ systemctl restart systemd-networkd.service
$ networkctl
IDX LINK     TYPE     OPERATIONAL SETUP
  1 lo       loopback carrier     unmanaged
  2 enp2s0f0 ether    off         unmanaged
  3 enp5s0   ether    off         unmanaged
  4 dummy1   ether    degraded    configuring
  5 dummy3   ether    degraded    configuring
  6 sit0     sit      off         unmanaged
  8 wlp3s0   wlan     routable    configured
  9 he-ipv6  sit      routable    configured

8 links listed.

Finally, I configure my flows in /etc/iked.conf, making sure to assign the received address to the interface dummy1.

ikev2 'hades' active esp \
        from dynamic to 10.0.1.0/24 \
        peer hades.rak.ac \
        srcid '/CN=asteria.rak.ac' \
        dstid '/CN=hades.rak.ac' \
        request address 10.0.1.103 \
        iface dummy1

Restarting openiked and checking the status of the interface reveals that it has been assigned an address on the internal network and that it is routable:

$ systemctl restart openiked.service
$ networkctl status dummy1
● 4: dummy1
                     Link File: /usr/lib/systemd/network/99-default.link
                  Network File: /etc/systemd/network/20-dummy1.network
                          Type: ether
                          Kind: dummy
                         State: routable (configured)
                  Online state: online
                        Driver: dummy
              Hardware Address: 22:50:5f:98:a1:a9
                           MTU: 1500
                         QDisc: noqueue
  IPv6 Address Generation Mode: eui64
          Queue Length (Tx/Rx): 1/1
                       Address: 10.0.1.103
                                fe80::2050:5fff:fe98:a1a9
                           DNS: 10.0.1.1
                 Route Domains: .
             Activation Policy: up
           Required For Online: yes
             DHCP6 Client DUID: DUID-EN/Vendor:0000ab11aafa4f02d6ac68d40000

I’d be happy to hear if there are simpler or more idiomatic ways to configure this under systemd.

25 June, 2022 11:41AM

June 24, 2022

hackergotchi for Kees Cook

Kees Cook

finding binary differences

As part of the continuing work to replace 1-element arrays in the Linux kernel, it’s very handy to show that a source change has had no executable code difference. For example, if you started with this:

struct foo {
    unsigned long flags;
    u32 length;
    u32 data[1];
};

void foo_init(int count)
{
    struct foo *instance;
    size_t bytes = sizeof(*instance) + sizeof(u32) * (count - 1);
    ...
    instance = kmalloc(bytes, GFP_KERNEL);
    ...
};

And you changed only the struct definition:

-    u32 data[1];
+    u32 data[];

The bytes calculation is going to be incorrect, since it is still subtracting 1 element’s worth of space from the desired count. (And let’s ignore for the moment the open-coded calculation that may end up with an arithmetic over/underflow here; that can be solved separately by using the struct_size() helper or the size_mul(), size_add(), etc family of helpers.)

The missed adjustment to the size calculation is relatively easy to find in this example, but sometimes it’s much less obvious how structure sizes might be woven into the code. I’ve been checking for issues by using the fantastic diffoscope tool. It can produce a LOT of noise if you try to compare builds without keeping in mind the issues solved by reproducible builds, with some additional notes. I prepare my build with the “known to disrupt code layout” options disabled, but with debug info enabled:

$ KBF="KBUILD_BUILD_TIMESTAMP=1970-01-01 KBUILD_BUILD_USER=user KBUILD_BUILD_HOST=host KBUILD_BUILD_VERSION=1"
$ OUT=gcc
$ make $KBF O=$OUT allmodconfig
$ ./scripts/config --file $OUT/.config \
        -d GCOV_KERNEL -d KCOV -d GCC_PLUGINS -d IKHEADERS -d KASAN -d UBSAN \
        -d DEBUG_INFO_NONE -e DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
$ make $KBF O=$OUT olddefconfig

Then I build a stock target, saving the output in “before”. In this case, I’m examining drivers/scsi/megaraid/:

$ make -jN $KBF O=$OUT drivers/scsi/megaraid/
$ mkdir -p $OUT/before
$ cp $OUT/drivers/scsi/megaraid/*.o $OUT/before/

Then I patch and build a modified target, saving the output in “after”:

$ vi the/source/code.c
$ make -jN $KBF O=$OUT drivers/scsi/megaraid/
$ mkdir -p $OUT/after
$ cp $OUT/drivers/scsi/megaraid/*.o $OUT/after/

And then run diffoscope:

$ diffoscope $OUT/before/ $OUT/after/

If diffoscope output reports nothing, then we’re done. 🥳

Usually, though, when source lines move around other stuff will shift too (e.g. WARN macros rely on line numbers, so the bug table may change contents a bit, etc.), and diffoscope output will look noisy. To examine just the executable code, the command that diffoscope used is reported in the output, and we can run it directly but leave the (possibly shifted) line numbers out, i.e. run objdump without --line-numbers:

$ ARGS="--disassemble --demangle --reloc --no-show-raw-insn --section=.text"
$ for i in $(cd $OUT/before && echo *.o); do
        echo $i
        diff -u <(objdump $ARGS $OUT/before/$i | sed "0,/^Disassembly/d") \
                <(objdump $ARGS $OUT/after/$i  | sed "0,/^Disassembly/d")
done

If I see an unexpected difference, for example:

-    c120:      movq   $0x0,0x800(%rbx)
+    c120:      movq   $0x0,0x7f8(%rbx)

Then I'll search for the pattern with line numbers added to the objdump output:

$ vi <(objdump --line-numbers $ARGS $OUT/after/megaraid_sas_fp.o)

I'd search for "0x0,0x7f8", find the source file and line number above it, open that source file at that position, and look to see where something was being miscalculated:

$ vi drivers/scsi/megaraid/megaraid_sas_fp.c +329

Once tracked down, I'd start over at the "patch and build a modified target" step above, repeating until there were no differences. For example, in the starting example, I'd also need to make this change:

-    size_t bytes = sizeof(*instance) + sizeof(u32) * (count - 1);
+    size_t bytes = sizeof(*instance) + sizeof(u32) * count;

Though, as hinted earlier, better yet would be:

-    size_t bytes = sizeof(*instance) + sizeof(u32) * (count - 1);
+    size_t bytes = struct_size(instance, data, count);

But sometimes adding the helper usage will add binary output differences since they're performing overflow checking that might saturate at SIZE_MAX. To help with patch clarity, those changes can be done separately from fixing the array declaration.

© 2022, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.
CC BY-SA 4.0

24 June, 2022 08:11PM by kees

June 28, 2022

Dima Kogan

vnlog 1.33 released

This is a minor release to the vnlog toolkit that adds a few convenience options to the vnl-filter tool. The new options are

vnl-filter -l

Prints out the existing columns, and exits. I've been low-level wanting this for years, but never acutely enough to actually write it. Today I finally did it.

vnl-filter --sub-abs

Defines an absolute-value abs() function in the default awk mode. I've been low-level wanting this for years as well. Previously I'd use --perl just to get abs(), or I'd explicitly define it: --sub 'abs(x) {return x>0?x:-x;}'. Typing all that out was becoming tiresome, and now I don't need to anymore.

vnl-filter --begin ... and vnl-filter --end ...

These add BEGIN and END clauses. They're useful to, for instance, use a perl module in BEGIN, or to print out some final output in END. Previously you'd add these inside the --eval block, but that was awkward because BEGIN and END would then appear inside the while(<>) { } loop. And there was no clear way to do it in the normal -p mode (no --eval).
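
Here's a quick sketch of the new options in action, using a hypothetical data.vnl with columns x and y (the file and the expressions are made up for illustration):

vnl-filter -l < data.vnl                       # list the available columns, then exit
vnl-filter --sub-abs 'abs(x-y) > 5' data.vnl   # keep rows where |x-y| > 5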

Clearly these are all minor, since the toolkit is now mature. It does everything I want it to that doesn't require lots of work to implement. The big missing features that I want would patch the underlying GNU coreutils instead of vnlog:

  • The sort tool can select different sorting modes, but join works only with alphanumeric sorting. join should have similarly selectable sorting modes. In the vnlog wrapper I can currently do something like vnl-join --vnl-sort n. This would pre-sort the input alphanumerically, and then post-sort it numerically. That is slow for big datasets. If join could handle numerically-sorted data directly, neither the pre- nor the post-sort would be needed
  • When joining on a numerical field, join should be able to do some sort of interpolation when given fields that don't match exactly.

Both of these probably wouldn't take a ton of work to implement, and I'll look into it someday.

28 June, 2022 04:47PM by Dima Kogan

July 01, 2022

hackergotchi for Ben Hutchings

Ben Hutchings

Debian LTS work, June 2022

In June I was not assigned additional hours of work by Freexian's Debian LTS initiative, but carried over 16 hours from May and worked all of those hours.

I spent some time triaging security issues for Linux. I tested several security fixes for Linux 4.9 and 4.19 and submitted them for inclusion in the upstream stable branches.

I rebased the Linux 4.9 (linux) package on the latest stable update (4.9.320), uploaded this and issued the final DLA for stretch, DLA-3065-1.

01 July, 2022 01:12PM

June 21, 2022

hackergotchi for Steve Kemp

Steve Kemp

Writing a simple TCL interpreter in golang

Recently I was reading Antirez's piece TCL the Misunderstood again, which is a nice defense of the utility and value of the TCL language.

TCL is one of those scripting languages which used to be used a hell of a lot in the past, for scripting routers, creating GUIs, and more. These days it quietly lives on, but doesn't get much love. That said it's a remarkably simple language to learn, and experiment with.

Using TCL always reminds me of FORTH, in the sense that the syntax consists of "words" with "arguments", and everything is a string (well, not really, but almost. Some things are lists too of course).

A simple overview of TCL would probably begin by saying that everything is a command, and that the syntax is very free. There are just a couple of clever rules which are applied consistently to give you a remarkably flexible environment.

To get started we'll set a string value to a variable:

  set name "Steve Kemp"
  => "Steve Kemp"

Now you can output that variable:

  puts "Hello, my name is $name"
  => "Hello, my name is Steve Kemp"

OK, it looks a little verbose due to the use of set, and puts is less pleasant than print or echo, but it works. It is readable.

Next up? Interpolation. We saw how $name expanded to "Steve Kemp" within the string. That's true more generally, so we can do this:

 set print pu
 set me    ts

 $print$me "Hello, World"
 => "Hello, World"

There "$print" and "$me" expanded to "pu" and "ts" respectively. Resulting in:

 puts "Hello, World"

That expansion happened before the input was executed, and works as you'd expect. There's another form of expansion too, which involves the [ and ] characters. Anything within the square-brackets is replaced with the contents of evaluating that body. So we can do this:

 puts "1 + 1 = [expr 1 + 1]"
 => "1 + 1 = 2"

Perhaps enough detail there, except to say that we can use { and } to enclose things that are NOT expanded, or executed, at parse time. This facility lets us evaluate those blocks later, so you can write a while-loop like so:

 set cur 1
 set max 10

 while { expr $cur <= $max } {
       puts "Loop $cur of $max"
       incr cur
 }

Anyway that's enough detail. Much like writing a FORTH interpreter the key to implementing something like this is to provide the bare minimum of primitives, then write the rest of the language in itself.

You can get a usable scripting language with only a small number of the primitives, and then evolve the rest yourself. Antirez also did this; he put together a small TCL interpreter in C named picol:

Other people have done similar things; recently I saw this writeup which follows the same approach:

So of course I had to do the same thing, in golang:

My code runs the original code from Antirez with only minor changes, and was a fair bit of fun to put together.

Because the syntax is so fluid there's no complicated parsing involved, and the core interpreter was written in only a few hours then improved step by step.

Of course to make a language more useful you need I/O, beyond just writing to the console - and being able to run the list-operations would make it much more useful to TCL users. That said, I had fun writing it, it seems to work, and once again I added fuzz-testers to the lexer and parser to satisfy myself that it was at least somewhat robust.

Feedback welcome, but even in quiet isolation it's fun to look back at these "legacy" languages and recognize their simplicity lead to a lot of flexibility.

21 June, 2022 01:00PM

July 01, 2022

An update on my simple golang TCL interpreter

So my previous post introduced a trivial interpreter for a TCL-like language.

In the past week or two I've cleaned it up, fixed a bunch of bugs, and added 100% test-coverage. I'm actually pretty happy with it now.

One of the reasons for starting this toy project was to experiment with how easy it is to extend the language using itself.

Some things are simple, for example replacing this:

puts "3 x 4 = [expr 3 * 4]"

With this:

puts "3 x 4 = [* 3 4]"

Just means defining a function (proc) named *, which we can do like so:

proc * {a b} {
    expr $a * $b
}

(Of course we don't have lists, or variadic arguments, so this is still a bit of a toy example.)
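For reference, here's roughly what a variadic version could look like in standard TCL, using the special args parameter and foreach, neither of which the toy interpreter has yet; treat it as a sketch of where this could go:

proc * {args} {
    # multiply together however many arguments we were given
    set res 1
    foreach n $args {
        set res [expr {$res * $n}]
    }
    return $res
}

puts "2 x 3 x 4 = [* 2 3 4]"
=> "2 x 3 x 4 = 24"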

Doing more than that is hard, though, without more primitives written in the parent language than I've implemented. The obvious thing I'm missing is a native implementation of upvar, the TCL primitive that lets a procedure read and update variables in higher scopes. Without that you can't write things as nicely as you would like, and you have to fall back on horrid hacks, or simply can't do some things at all.

# define a procedure to run a body N times
proc repeat {n body} {
    set res ""
    while {> $n 0} {
        decr n
        set res [$body]
    }
    $res
}

# test it out
set foo 12
repeat 5 { incr foo }

#  foo is now 17 (i.e. 12 + 5)
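For comparison, here's roughly how repeat reads in standard TCL, where uplevel (a close relative of upvar) evaluates the body one level up, in the caller's scope, so incr foo really updates the caller's variable. This is only a sketch of what the missing primitive buys you; it won't run in the toy interpreter as it stands:

proc repeat {n body} {
    set res ""
    while {$n > 0} {
        # run the body in the caller's scope
        set res [uplevel 1 $body]
        incr n -1
    }
    return $res
}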

A similar story implementing the loop word, which should allow you to set the contents of a variable and run a body a number of times:

proc loop {var min max bdy} {
    # result
    set res ""

    # set the variable.  Horrid.
    # We miss upvar here.
    eval "set $var [set min]"

    # Run the test
    while {<= [set "$$var"] $max } {
        set res [$bdy]

        # This is a bit horrid
        # We miss upvar here, and not for the first time.
        eval {incr "$var"}
    }

    # return the last result
    $res
}


loop cur 0 10 { puts "current iteration $cur ($min->$max)" }
# output is:
# => current iteration 0 (0->10)
# => current iteration 1 (0->10)
# ...
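And the loop word with the real primitives available (standard TCL again, so another sketch rather than something the toy interpreter can run): upvar aliases the caller's variable and for does the counting, which removes the need for all the eval trickery above:

proc loop {var min max bdy} {
    # alias "v" to the caller's variable named in $var
    upvar 1 $var v
    set res ""
    for {set v $min} {$v <= $max} {incr v} {
        # evaluate the body in the caller's scope
        set res [uplevel 1 $bdy]
    }
    return $res
}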

That said I did have fun writing some simple test-cases, and implementing assert, assert_equal, etc.

In conclusion, I think the number of primitives needed to implement your own control-flow and run-time behaviour is a bit higher than I'd like. Writing switch, repeat, while, and similar words inside TCL is harder than creating those same things in FORTH, for example.

01 July, 2022 07:00PM

June 16, 2022

Dima Kogan

Ricoh GR IIIx 802.11 reverse engineering

I just got a fancy new camera: Ricoh GR IIIx. It's pretty great, and I strongly recommend it to anyone who wants a truly pocketable camera with fantastic image quality and full manual controls. One annoyance is the connectivity. It does have both Bluetooth and 802.11, but the only official method of using them is some dinky closed phone app. This is silly. I just did some reverse-engineering, and I now have a functional shell script to download the last few images via 802.11. This is more convenient than plugging in a wire or pulling out the memory card. Fortunately, Ricoh didn't bend over backwards to make the reversing difficult, so to figure it out I didn't even need to download the phone app or sniff the traffic.

When you turn on the 802.11 on the camera, it says stuff about essid and password, so clearly the camera runs its own access point. Not ideal, but it's good enough. I connected, and ran nmap to find hosts and open ports: only port 80 on 192.168.0.1 is open. Pointing curl at it yields some error, so I need to figure out the valid endpoints. I downloaded the firmware binary, and tried to figure out what's in it:

dima@shorty:/tmp$ binwalk fwdc243b.bin

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
3036150       0x2E53F6        Cisco IOS microcode, for "8"
3164652       0x3049EC        Certificate in DER format (x509 v3), header length: 4, sequence length: 5412
5472143       0x537F8F        Copyright string: "Copyright ("
6128763       0x5D847B        PARity archive data - file number 90
10711634      0xA37252        gzip compressed data, maximum compression, from Unix, last modified: 2022-02-15 05:47:23
13959724      0xD5022C        MySQL ISAM compressed data file Version 11
24829873      0x17ADFB1       MySQL MISAM compressed data file Version 4
24917663      0x17C369F       MySQL MISAM compressed data file Version 4
24918526      0x17C39FE       MySQL MISAM compressed data file Version 4
24921612      0x17C460C       MySQL MISAM compressed data file Version 4
24948153      0x17CADB9       MySQL MISAM compressed data file Version 4
25221672      0x180DA28       MySQL MISAM compressed data file Version 4
25784158      0x1896F5E       Cisco IOS microcode, for "\"
26173589      0x18F6095       MySQL MISAM compressed data file Version 4
28297588      0x1AFC974       MySQL ISAM compressed data file Version 6
28988307      0x1BA5393       MySQL ISAM compressed data file Version 3
28990184      0x1BA5AE8       MySQL MISAM index file Version 3
29118867      0x1BC5193       MySQL MISAM index file Version 3
29449193      0x1C15BE9       JPEG image data, JFIF standard 1.01
29522133      0x1C278D5       JPEG image data, JFIF standard 1.08
29522412      0x1C279EC       Copyright string: "Copyright ("
29632931      0x1C429A3       JPEG image data, JFIF standard 1.01
29724094      0x1C58DBE       JPEG image data, JFIF standard 1.01

The gzip chunk looks like what I want:

dima@shorty:/tmp$ tail -c+10711635 fwdc243b.bin> /tmp/tst.gz


dima@shorty:/tmp$ < /tmp/tst.gz gunzip | file -

/dev/stdin: ASCII cpio archive (SVR4 with no CRC)


dima@shorty:/tmp$ < /tmp/tst.gz gunzip > tst.cpio

OK, we have some .cpio thing. It's plain-text. I grep around in it, looking for GET and POST and such, and I see various URI-looking things at /v1/..... Grepping for that, I see

dima@shorty:/tmp$ strings tst.cpio | grep /v1/

GET /v1/debug/revisions
GET /v1/ping
GET /v1/photos
GET /v1/props
PUT /v1/params/device
PUT /v1/params/lens
PUT /v1/params/camera
GET /v1/liveview
GET /v1/transfers
POST /v1/device/finish
POST /v1/device/wlan/finish
POST /v1/lens/focus
POST /v1/camera/shoot
POST /v1/camera/shoot/compose
POST /v1/camera/shoot/cancel
GET /v1/photos/{}/{}
GET /v1/photos/{}/{}/info
PUT /v1/photos/{}/{}/transfer
/v1/photos/<string>/<string>
/v1/photos/<string>/<string>/info
/v1/photos/<string>/<string>/transfer
/v1/device/finish
/v1/device/wlan/finish
/v1/lens/focus
/v1/camera/shoot
/v1/camera/shoot/compose
/v1/camera/shoot/cancel
/v1/changes
/v1/changes message received.
/v1/changes issue event.
/v1/changes new websocket connection.
/v1/changes websocket connection closed. reason({})
/v1/transfers, transferState({}), afterIndex({}), limit({})

Jackpot. I pointed curl at most of these, and they do interesting things. Generally they all spit out JSON. /v1/liveview sends out a sequence of JPEG images. The thing I care about is /v1/photos/DIRECTORY/FILE and /v1/photos/DIRECTORY/FILE/info. The result is a script I just wrote to connect to the camera, download N images, and connect back to the original access point:

https://github.com/dkogan/ricoh-download

Kinda crude, but works for now. I'll improve it with time.

After I did this I found an old thread from 2015 where somebody was using an apparently-compatible camera, and wrote a fancier tool:

https://www.pentaxforums.com/forums/184-pentax-k-s1-k-s2/295501-k-s2-wifi-laptop-2.html

16 June, 2022 10:04PM by Dima Kogan

June 12, 2022

Iustin Pop

Somewhat committing to a new sport

Quite a few years ago - 4, to be precise, so in 2018 - I did a couple of SUP trainings, organised by a colleague. That was enjoyable, but not really a match for me (asymmetric paddling, ugh!), so I also learned some kayaking, which I really love, but that’s way higher overhead - no sea around in Switzerland, and lakes are generally too small. So I basically postponed any more water sports 😞, until sometime in the future when I’ll finally decide what I want to do (and in what setup).

I did a couple of one-off SUP rides in various places (2019, 2021), but I really was out of practice, so it wasn’t really enjoyable. But with family, SUP offers a much easier way to carry a passenger (than a kayak), so slowly I started thinking more about doing it more seriously.

So last week, after much deliberation, I bought an inflatable board, paddle and various other accessories, and on Saturday went to try it out, in excellent weather (completely flat water) and hot but not overly so. Choosing the board was itself something I like doing (researching options), so for a bit I was concerned whether I’m more interested in the gear or the actual paddling itself…

To my surprise, it went way better than I feared - last time I tried it, I paddled for 30 minutes on my knees (knee-paddling?!), since I didn’t dare stand up. But this time, I launched and then did stand up, and while very shaky, I didn’t fall in. Neither by myself, nor with an extra passenger 😉

An hour later, my initial shakiness went away, with the trainings slowly coming back to mind. Another half hour, and - for completely flat water - I felt quite confident. The view was awesome, the weather nice, the water cold enough to be refreshing… and the only question on my mind was - why didn’t I do this 2, 3 years ago? Well, Corona aside.

I forgot how much I love just being on the water. It definitely makes up for the hassle of going somewhere, unpacking the stuff and pumping up the board (that’s a bit of a sport in itself 😃), because the blue-green-light-blue colour palette is just how things should be:

Small lake, but beautiful view

Well, approximately blue. This being a small lake, it’s more blue-green than proper blue. That’s next level, since bigger lakes mean waves, and more traffic.

Of course, this could also turn out like many other things I’ve tried (a device sitting unused in a corner), but at least for yesterday, I was a happy paddler!

12 June, 2022 09:00PM

July 21, 2022

Antoine Beaupré

Matrix notes

I have some concerns about Matrix (the protocol, not the movie that came out recently, although I do have concerns about that as well). I've been watching the project for a long time, and it seems like a promising alternative to many protocols like IRC, XMPP, and Signal.

This review may sound a bit negative, because it focuses on those concerns. I am the operator of an IRC network and people keep asking me to bridge it with Matrix. I have myself considered just giving up on IRC and converting to Matrix. This space is a living document exploring my research into that problem space. The TL;DR is: no, I'm not setting up a bridge just yet, and I'm still on IRC.

This article was written over the course of the last three months, but I have been watching the Matrix project for years (my logs seem to say 2016 at least). The article is rather long. It will likely take you half an hour to read, so copy this over to your ebook reader, your tablet, or dead trees, and lean back and relax as I show you around the Matrix. Or, alternatively, just jump to a section that interests you, most likely the conclusion.

Introduction to Matrix

Matrix is an "open standard for interoperable, decentralised, real-time communication over IP. It can be used to power Instant Messaging, VoIP/WebRTC signalling, Internet of Things communication - or anywhere you need a standard HTTP API for publishing and subscribing to data whilst tracking the conversation history".

It's also (when compared with XMPP) "an eventually consistent global JSON database with an HTTP API and pubsub semantics - whilst XMPP can be thought of as a message passing protocol."

According to their FAQ, the project started in 2014, has about 20,000 servers, and millions of users. Matrix works over HTTPS, but on a special port: 8448.

Security and privacy

I have some concerns about the security promises of Matrix. It's advertised as "secure" with "E2E [end-to-end] encryption", but how does it actually work?

Data retention defaults

One of my main concerns with Matrix is data retention, which is a key part of security in a threat model where (for example) a hostile state actor wants to surveil your communications and can seize your devices.

On IRC, servers don't actually keep messages all that long: they pass them along to other servers and clients as fast as they can, only keep them in memory, and move on to the next message. There are no concerns about data retention on messages (and their metadata) other than at the network layer. (I'm ignoring the issues with user registration, which is a separate, if valid, concern.) Obviously, a hostile server could log everything passing through it, but IRC federations are normally tightly controlled. So, if you trust your IRC operators, you should be fairly safe. Obviously, clients can (and often do, even if OTR is configured!) log all messages, but this is generally not the default. Irssi, for example, does not log by default. IRC bouncers are more likely to log to disk, of course, to be able to do what they do.

Compare this to Matrix: when you send a message to a Matrix homeserver, that server first stores it in its internal SQL database. Then it will transmit that message to all clients connected to that server and room, and to all other servers that have clients connected to that room. Those remote servers, in turn, will keep a copy of that message and all its metadata in their own database, by default forever. In encrypted rooms those messages are encrypted, but not their metadata.

There is a mechanism to expire entries in Synapse, but it is not enabled by default. So one should generally assume that a message sent on Matrix is never expired.

GDPR in the federation

But even if that setting was enabled by default, how do you control it? This is a fundamental problem of the federation: if any user is allowed to join a room (which is the default), those users' servers will log all content and metadata from that room. That includes private, one-on-one conversations, since those are essentially rooms as well.

In the context of the GDPR, this is really tricky: who is the responsible party (known as the "data controller") here? It's basically any yahoo who fires up a home server and joins a room.

In a federated network, one has to wonder whether GDPR enforcement is even possible at all. But in Matrix in particular, if you want to enforce your right to be forgotten in a given room, you would have to:

  1. enumerate all the users that ever joined the room while you were there
  2. discover all their home servers
  3. start a GDPR procedure against all those servers

I recognize this is a hard problem to solve while still keeping an open ecosystem. But I believe that Matrix should have much stricter defaults towards data retention than right now. Message expiry should be enforced by default, for example. (Note that there are also redaction policies that could be used to implement part of the GDPR automatically, see the privacy policy discussion below on that.)

Also keep in mind that, in the brave new peer-to-peer world that Matrix is heading towards, the boundary between server and client is likely to be fuzzier, which would make applying the GDPR even more difficult.

Update: this comment links to this post (in German) which apparently studied the question and concluded that Matrix is not GDPR-compliant.

In fact, maybe Synapse should be designed so that there's no configurable flag to turn off data retention. A bit like how most system loggers in UNIX (e.g. syslog) come with a log retention system that typically rotates logs after a few weeks or months. Historically, this was designed to keep hard drives from filling up, but it also has the added benefit of limiting the amount of personal information kept on disk in this modern day. (Arguably, syslog doesn't rotate logs on its own, but, say, Debian GNU/Linux, as an installed system, does have log retention policies well defined for installed packages, and those can be discussed. And "no expiry" is definitely a bug.)

Matrix.org privacy policy

When I first looked at Matrix, five years ago, Element.io was called Riot.im and had a rather dubious privacy policy:

We currently use cookies to support our use of Google Analytics on the Website and Service. Google Analytics collects information about how you use the Website and Service.

[...]

This helps us to provide you with a good experience when you browse our Website and use our Service and also allows us to improve our Website and our Service.

When I asked Matrix people about why they were using Google Analytics, they explained this was for development purposes and they were aiming for velocity at the time, not privacy (paraphrasing here).

They also included a "free to snitch" clause:

If we are or believe that we are under a duty to disclose or share your personal data, we will do so in order to comply with any legal obligation, the instructions or requests of a governmental authority or regulator, including those outside of the UK.

Those are really broad terms, above and beyond what is typically expected legally.

Like the current retention policies, such user tracking and ... "liberal" collaboration practices with the state set a bad precedent for other home servers.

Thankfully, since the above policy was published (2017), the GDPR was "implemented" (2018) and it seems like both the Element.io privacy policy and the Matrix.org privacy policy have been somewhat improved since.

Notable points of the new privacy policies:

  • 2.3.1.1: the "federation" section actually outlines that "Federated homeservers and Matrix clients which respect the Matrix protocol are expected to honour these controls and redaction/erasure requests, but other federated homeservers are outside of the span of control of Element, and we cannot guarantee how this data will be processed"
  • 2.6: users under the age of 16 should not use the matrix.org service
  • 2.10: Upcloud, Mythic Beast, Amazon, and CloudFlare possibly have access to your data (it's nice to at least mention this in the privacy policy: many providers don't even bother admitting to this kind of delegation)
  • Element 2.2.1: mentions many more third parties (Twilio, Stripe, Quaderno, LinkedIn, Twitter, Google, Outplay, PipeDrive, HubSpot, Posthog, Sentry, and Matomo, phew!) used when you are paying Matrix.org for hosting

I'm not super happy with all the trackers they have on the Element platform, but then again you don't have to use that service. Your favorite homeserver (assuming you are not on Matrix.org) probably has their own Element deployment, hopefully without all that garbage.

Overall, this is all a huge improvement over the previous privacy policy, so hats off to the Matrix people for figuring out a reasonable policy in such a tricky context. I particularly like this bit:

We will forget your copy of your data upon your request. We will also forward your request to be forgotten onto federated homeservers. However - these homeservers are outside our span of control, so we cannot guarantee they will forget your data.

It's great they implemented those mechanisms and, after all, if there's a hostile party in there, nothing can prevent them from using screenshots to just exfiltrate your data away from the client side anyways, even with services typically seen as more secure, like Signal.

As an aside, I also appreciate that Matrix.org has a fairly decent code of conduct, based on the TODO CoC which checks all the boxes in the geekfeminism wiki.

Metadata handling

Overall, privacy protections in Matrix mostly concern message contents, not metadata. In other words, who's talking with whom, when, and from where is not well protected. Compared to a tool like Signal, which goes to great lengths to anonymize that data with features like private contact discovery, disappearing messages, sealed senders, and private groups, Matrix is definitely behind. (Note: there is an issue open about message lifetimes in Element since 2020, but it's not even at the MSC stage yet.)

This is a known issue (opened in 2019) in Synapse, but this is not just an implementation issue, it's a flaw in the protocol itself. Home servers keep join/leave events for all rooms, which gives clear-text information about who is talking to whom. Synapse logs may also contain personally identifiable information that home server admins might not be aware of in the first place. Those log rotation policies are separate from the server-level retention policy, which may be confusing for a novice sysadmin.

Combine this with the federation: even if you trust your home server to do the right thing, the second you join a public room with third-party home servers, those ideas kind of get thrown out because those servers can do whatever they want with that information. Again, a problem that is hard to solve in any federation.

To be fair, IRC doesn't have a great story here either: any client knows not only who's talking to who in a room, but also typically their client IP address. Servers can (and often do) obfuscate this, but often that obfuscation is trivial to reverse. Some servers do provide "cloaks" (sometimes automatically), but that's kind of a "slap-on" solution that actually moves the problem elsewhere: now the server knows a little more about the user.

Overall, I would worry much more about a Matrix home server seizure than an IRC or Signal server seizure. Signal does get subpoenas, and they can only give out a tiny bit of information about their users: their phone number, their registration date, and their last connection date. Matrix carries a lot more information in its database.

Amplification attacks on URL previews

I (still!) run an Icecast server and sometimes share links to it on IRC, which, obviously, also end up on (more than one!) Matrix home servers because some people connect to IRC using Matrix. This, in turn, means that Matrix will connect to that URL to generate a link preview.

I feel this outlines a security issue, especially because those sockets would be kept open seemingly forever. I tried to warn the Matrix security team but somehow, I don't think this issue was taken very seriously. Here's the disclosure timeline:

  • January 18: contacted Matrix security
  • January 19: response: already reported as a bug
  • January 20: response: can't reproduce
  • January 31: timeout added, considered solved
  • January 31: I respond that I believe the security issue is underestimated, ask for clearance to disclose
  • February 1: response: asking for two weeks delay after the next release (1.53.0) including another patch, presumably in two weeks' time
  • February 22: Matrix 1.53.0 released
  • April 14: I notice the release, ask for clearance again
  • April 14: response: referred to the public disclosure

There are a couple of problems here:

  1. the bug was publicly disclosed in September 2020, and not considered a security issue until I notified them, and even then, I had to insist

  2. no clear disclosure policy timeline was proposed or seems established in the project (there is a security disclosure policy but it doesn't include any predefined timeline)

  3. I wasn't informed of the disclosure

  4. the actual solution is a size limit (10MB, already implemented), a time limit (30 seconds, implemented in PR 11784), and a content type allow list (HTML, "media" or JSON, implemented in PR 11936), and I'm not sure it's adequate

  5. (pure vanity:) I did not make it to their Hall of fame

I'm not sure those solutions are adequate because they all seem to assume a single home server will pull that one URL for a little while then stop. But in a federated network, many (possibly thousands of) home servers may be connected in a single room at once. If an attacker drops a link into such a room, all those servers would connect to that link all at once. This is an amplification attack: a small amount of traffic will generate a lot more traffic to a single target. It doesn't matter that there are size or time limits: the amplification is what matters here.

It should also be noted that clients that generate link previews have more amplification because they are more numerous than servers. And of course, the default Matrix client (Element) does generate link previews as well.

That said, this is possibly not a problem specific to Matrix: any federated service that generates link previews may suffer from this.

I'm honestly not sure what the solution is here. Maybe moderation? Maybe link previews are just evil? All I know is there was this weird bug in my Icecast server and I tried to ring the bell about it, and it feels it was swept under the rug. Somehow I feel this is bound to blow up again in the future, even with the current mitigation.

Moderation

In Matrix, like elsewhere, moderation is a hard problem. There is a detailed moderation guide and much of this problem space is actively being worked on in Matrix right now. A fundamental problem with moderating a federated space is that a user banned from a room can rejoin the room from another server. This is why spam is such a problem in Email, and why IRC networks stopped federating ages ago (see the IRC history for that fascinating story).

The mjolnir bot

The mjolnir moderation bot is designed to help with some of those things. It can kick and ban users, and redact all of a user's messages (as opposed to one by one), all of this across multiple rooms. It can also subscribe to a federated block list published by matrix.org to block known abusers (users or servers). Bans are pretty flexible and can operate at the user, room, or server level.

Matrix people suggest making the bot admin of your channels, because you can't take back admin from a user once given.

The command-line tool

There's also a new command line tool designed to do things like:

  • System notify users (all users/users from a list, specific user)
  • delete sessions/devices not seen for X days
  • purge the remote media cache
  • select rooms with various criteria (external/local/empty/created by/encrypted/cleartext)
  • purge history of these rooms
  • shutdown rooms

This tool and Mjolnir are based on the admin API built into Synapse.

Rate limiting

Synapse has pretty good built-in rate-limiting which blocks repeated login, registration, joining, or messaging attempts. It may also end up throttling servers on the federation based on those settings.

Fundamental federation problems

Because users joining a room may come from another server, room moderators are at the mercy of the registration and moderation policies of those servers. Matrix is like IRC's +R mode ("only registered users can join") by default, except that anyone can register their own homeserver, which makes this limited.

Server admins can block IP addresses and home servers, but those tools are not easily available to room admins. There is an API (m.room.server_acl in /devtools) but it is not reliable (thanks Austin Huang for the clarification).

Matrix has the concept of guest accounts, but it is not used very much, and virtually no client or homeserver supports it. This contrasts with the way IRC works: by default, anyone can join an IRC network even without authentication. Some channels require registration, but in general you are free to join and look around (until you get blocked, of course).

I have seen anecdotal evidence (CW: Twitter, nitter link) that "moderating bridges is hell", and I can imagine why. Moderation is already hard enough on one federation; when you bridge a room with another network, you inherit all the problems from that network, but without all the abuse control tools from the original network's API...

Room admins

Matrix, in particular, has the problem that room administrators (who have the power to redact messages, ban users, and promote other users) are bound to their Matrix ID which is, in turn, bound to their home server. This implies that a home server administrator could (1) impersonate a given user and (2) use that to hijack the room. So in practice, the home server is the trust anchor for rooms, not the users themselves.

That said, if server B's administrator hijacks user joe on server B, they will hijack that room on that specific server. This will not (necessarily) affect users on the other servers, as servers could refuse parts of the updates or ban the compromised account (or server).

It does seem like a major flaw that room credentials are bound to Matrix identifiers, as opposed to the E2E encryption credentials. In an encrypted room, even with fully verified members, a compromised or hostile home server can still take over the room by impersonating an admin. That admin (or even a newly minted user) can then send events or listen in on the conversations.

This is even more frustrating when you consider that Matrix events are actually signed and therefore have some authentication attached to them, acting like some sort of Merkle tree (as it contains a link to previous events). That signature, however, is made from the homeserver PKI keys, not the client's E2E keys, which makes E2E feel like it has been "bolted on" later.

Availability

While Matrix has a strong advantage over Signal in that it's decentralized (so anyone can run their own homeserver), I couldn't find an easy way to run a "multi-primary" setup, or even a "redundant" setup (even one with a single primary backend), short of going full-on "replicate PostgreSQL and Redis data", which is not typically for the faint of heart.

How this works in IRC

On IRC, it's quite easy to setup redundant nodes. All you need is:

  1. a new machine (with its own public address and an open port)

  2. a shared secret (or certificate) between that machine and an existing one on the network

  3. a connect {} block on both servers

That's it: the node will join the network and people can connect to it as usual and share the same user/namespace as the rest of the network. The servers take care of synchronizing state: you do not need to worry about replicating a database server.

(Now, experienced IRC people will know there's a catch here: IRC doesn't have authentication built in, and relies on "services", which are basically bots that authenticate users (I'm simplifying, don't nitpick). If that service goes down, the network still works, but then people can't authenticate, and attackers can start doing nasty things like stealing people's identities if they get knocked offline. But still: basic functionality still works: you can talk in rooms and with users that are on the reachable network.)

User identities

Matrix is more complicated. Each "home server" has its own identity namespace: a specific user (say @anarcat:matrix.org) is bound to that specific home server. If that server goes down, that user is completely disconnected. They could register a new account elsewhere and reconnect, but then they basically lose all their configuration: contacts and joined channels are all lost.

(Also notice how the Matrix IDs don't look like a typical user address like an email in XMPP. They at least did their homework and got the allocation for the scheme.)

Rooms

Users talk to each other in "rooms", even in one-to-one communications. (Rooms are also used for other things like "spaces"; they're basically used for everything, a bit like the "everything is a file" idea.) For rooms, home servers act more like IRC nodes in that they keep a local state of the chat room and synchronize it with other servers. Users can keep talking inside a room if the server that originally hosts the room goes down. Rooms can have a local, server-specific "alias" so that, say, #room:matrix.org is also visible as #room:example.com on the example.com home server. Both addresses refer to the same underlying room.

(Finding this in the Element settings is not obvious though, because that "alias" is actually called a "local address" there. So to create such an alias (in Element), you need to go into the room settings' "General" section, "Show more" in "Local address", then add the alias name (e.g. foo), and then that room will be available on your example.com homeserver as #foo:example.com.)

So a room doesn't belong to a server, it belongs to the federation, and anyone can join the room from any server (if the room is public, or if invited otherwise). You can create a room on server A and when a user from server B joins, the room will be replicated on server B as well. If server A fails, server B will keep relaying traffic to connected users and servers.

A room is therefore not fundamentally addressed with the above alias; instead, it has an internal Matrix ID, which is basically a random string. It has a server name attached to it, but that was made just to avoid collisions. That can get a little confusing. For example, the #fractal:gnome.org room is an alias on the gnome.org server, but the room ID is !hwiGbsdSTZIwSRfybq:matrix.org. That's because the room was created on matrix.org, but the preferred branding is gnome.org now.

As an aside, rooms, by default, live forever, even after the last user quits. There's an admin API to delete rooms and a tombstone event to redirect to another one, but neither have a GUI yet. The latter is part of MSC1501 ("Room version upgrades") which allows a room admin to close a room, with a message and a pointer to another room.

Spaces

Discovering rooms can be tricky: there is a per-server room directory, but Matrix.org people are trying to deprecate it in favor of "Spaces". Room directories were ripe for abuse: anyone can create a room, so anyone can show up in there. It's possible to restrict who can add aliases, but anyway, directories were seen as too limited.

In contrast, a "Space" is basically a room that's an index of other rooms (including other spaces), so existing moderation and administration mechanism that work in rooms can (somewhat) work in spaces as well. This enables a room directory that works across federation, regardless on which server they were originally created.

New users can be added to a space or room automatically in Synapse. (Existing users can be told about the space with a server notice.) This gives admins a way to pre-populate a list of rooms on a server, which is useful to build clusters of related home servers, providing some sort of redundancy, at the room -- not user -- level.

Home servers

So while you can work around a home server going down at the room level, there's no such thing at the home server level, for user identities. So if you want those identities to be stable in the long term, you need to think about high availability. One limitation is that the domain name (e.g. matrix.example.com) must never change in the future, as renaming home servers is not supported.

The documentation used to say you could "run a hot spare" but that has been removed. Last I heard, it was not possible to run a high-availability setup where multiple, separate locations could replace each other automatically. You can have high performance setups where the load gets distributed among workers, but those are based on a shared database (Redis and PostgreSQL) backend.

So my guess is it would be possible to create a "warm" spare server of a matrix home server with regular PostgreSQL replication, but that is not documented in the Synapse manual. This sort of setup would also not be useful to deal with networking issues or denial of service attacks, as you will not be able to spread the load over multiple network locations easily. Redis and PostgreSQL heroes are welcome to provide their multi-primary solution in the comments. In the meantime, I'll just point out this is a solution that's handled somewhat more gracefully in IRC, by having the possibility of delegating the authentication layer.

Update: this was previously undocumented, but not only can you scale the frontend workers to multiple hosts, you can also shard the backend so that tables are distributed across multiple database hosts. This has been documented only on 2022-07-11, weeks after this article was written, so you will forgive me for that omission, hopefully. Obviously, this doesn't resolve the "high availability" scenario since you still have a central server for that data, but it might help resolve performance problems for very large instances.

Delegations

If you do not want to run a Matrix server yourself, it's possible to delegate the entire thing to another server. There's a server discovery API which uses the .well-known pattern (or SRV records, but that's "not recommended" and a bit confusing) to delegate that service to another server. Be warned that the server still needs to be explicitly configured for your domain. You can't just put:

{ "m.server": "matrix.org:443" }

... on https://example.com/.well-known/matrix/server and start using @you:example.com as a Matrix ID. That's because Matrix doesn't support "virtual hosting" and you'd still be connecting to rooms and people with your matrix.org identity, not example.com as you would normally expect. This is also why you cannot rename your home server.

The server discovery API is what allows servers to find each other. Clients, on the other hand, use the client-server discovery API: this is what allows a given client to find your home server when you type your Matrix ID on login.

Performance

The high availability discussion brushed over the performance of Matrix itself, but let's now dig into that.

Horizontal scalability

There were serious scalability issues with the main Matrix server, Synapse, in the past, so the Matrix team has been working hard to improve its design. Since Synapse 1.22, the home server can horizontally scale to multiple workers (see this blog post for details), which can make it easier to scale large servers.

Other implementations

There are other promising home server implementations from a performance standpoint (dendrite, Golang, entered beta in late 2020; conduit, Rust, beta; others), but none of those are feature-complete, so there's a trade-off to be made there. Synapse is also adding a lot of features fast, so it's an open question whether the others will ever catch up. (I have heard that Dendrite might actually surpass Synapse in features within a few years, which would put Synapse in a more "LTS" situation.)

Latency

Matrix can feel slow sometimes. For example, joining the "Matrix HQ" room in Element (from matrix.debian.social) takes a few minutes and then fails. That is because the home server has to sync the entire room state when you join the room. There was promising work on this announced in the lengthy 2021 retrospective, and some of that work landed (partial sync) in the 1.53 release already. Other improvements coming include sliding sync, lazy loading over federation, and fast room joins. So that's actually something that could be fixed in the fairly short term.

But in general, communication in Matrix doesn't feel as "snappy" as on IRC or even Signal. It's hard to quantify this without instrumenting a full latency test bed (for example the tools I used in the terminal emulators latency tests), but even just typing in a web browser feels slower than typing in a xterm or Emacs for me.

Even in conversations, I "feel" people don't immediately respond as fast. In fact, this could be an interesting double-blind experiment to make: have people guess whether they are talking to a person on Matrix, XMPP, or IRC, for example. My theory would be that people could notice that Matrix users are slower, if only because of the TCP round-trip time each message has to take.

Transport

Some courageous person actually made some tests of various messaging platforms on a congested network. His evaluation was basically:

  • Briar: uses Tor, so unusable except locally
  • Matrix: "struggled to send and receive messages", joining a room takes forever as it has to sync all history, "took 20-30 seconds for my messages to be sent and another 20 seconds for further responses"
  • XMPP: "worked in real-time, full encryption, with nearly zero lag"

So that was interesting. I suspect IRC would have also fared better, but that's just a feeling.

Other improvements to the transport layer include support for websocket and the CoAP proxy work from 2019 (targeting 100bps links), but both seem stalled at the time of writing. The Matrix people have also announced the pinecone p2p overlay network which aims at solving large, internet-scale routing problems. See also this talk at FOSDEM 2022.

Usability

Onboarding and workflow

The workflow for joining a room, when you use Element web, is not great:

  1. click on a link in a web browser
  2. land on (say) https://matrix.to/#/#matrix-dev:matrix.org
  3. offers "Element", yeah that's sounds great, let's click "Continue"
  4. land on https://app.element.io/#/room%2F%23matrix-dev%3Amatrix.org and then you need to register, aaargh

As you might have guessed by now, there is a specification to solve this, but web browsers need to adopt it as well, so that's far from actually being solved. At least browsers generally know about the matrix: scheme, it's just not exactly clear what they should do with it, especially when the handler is just another web page (e.g. Element web).

In general, when compared with tools like Signal or WhatsApp, Matrix doesn't fare so well in terms of user discovery. I probably have some of my normal contacts that have a Matrix account as well, but there's really no way to know. It's kind of creepy when Signal tells you "this person is on Signal!" but it's also pretty cool that it works, and they actually implemented it pretty well.

Registration is also less obvious: in Signal, the app confirms your phone number automatically. It's frictionless and quick. In Matrix, you need to learn about home servers, pick one, register (with a password! aargh!), and then set up encryption keys (not the default), etc. It's a lot more friction.

And look, I understand: giving away your phone number is a huge trade-off. I don't like it either. But it solves a real problem and makes encryption accessible to a ton more people. Matrix does have "identity servers" that can serve that purpose, but I don't feel confident sharing my phone number there. It doesn't help that the identity servers don't have private contact discovery: giving them your phone number is a more serious security compromise than with Signal.

There's a catch-22 here too: because no one feels like giving away their phone numbers, no one does, and everyone assumes that stuff doesn't work anyways. Like it or not, Signal forcing people to divulge their phone number actually gives them critical mass that means actually a lot of my relatives are on Signal and I don't have to install crap like WhatsApp to talk with them.

5 minute clients evaluation

Throughout all my tests I evaluated a handful of Matrix clients, mostly from Flathub because almost none of them are packaged in Debian.

Right now I'm using Element, the flagship client from Matrix.org, in a web browser window, with the PopUp Window extension. This makes it look almost like a native app, and opens links in my main browser window (instead of a new tab in that separate window), which is nice. But I'm tired of buying memory to feed my web browser, so this indirection has to stop. Furthermore, I'm often getting completely logged off from Element, which means re-logging in, recovering my security keys, and reconfiguring my settings. That is extremely annoying.

Coming from Irssi, Element is really "GUI-y" (pronounced "gooey"). Lots of clickety happening. To mark conversations as read, in particular, I need to click-click-click on all the tabs that have some activity. There's no "jump to latest message" or "mark all as read" functionality as far as I could tell. In Irssi the former is built-in (alt-a) and I made a custom /READ command for the latter:

/ALIAS READ script exec \$_->activity(0) for Irssi::windows

And yes, that's a Perl script in my IRC client. I am not aware of any Matrix client that does stuff like that, except maybe Weechat, if we can call it a Matrix client, or Irssi itself, now that it has a Matrix plugin (!).

As for other clients, I have looked through the Matrix Client Matrix (confusing right?) to try to figure out which one to try, and, even after selecting Linux as a filter, the chart is just too wide to figure out anything. So I tried those, kind of randomly:

  • Fractal
  • Mirage
  • Nheko
  • Quaternion

Unfortunately, I lost my notes on those, I don't actually remember which one did what. I still have a session open with Mirage, so I guess that means it's the one I preferred, but I remember they were also all very GUI-y.

Maybe I need to look at weechat-matrix or gomuks. At least Weechat is scriptable so I could continue playing the power-user. Right now my strategy with messaging (and that includes microblogging like Twitter or Mastodon) is that everything goes through my IRC client, so Weechat could actually fit well in there. Going with gomuks, on the other hand, would mean running it in parallel with Irssi or ... ditching IRC, which is a leap I'm not quite ready to take just yet.

Oh, and basically none of those clients (except Nheko and Element) support VoIP, which is still kind of a second-class citizen in Matrix. It does not support large multimedia rooms, for example: Jitsi was used for FOSDEM instead of the native videoconferencing system.

Bots

This falls a little outside the "usability" section, but I didn't know where else to put this... There are a few Matrix bots out there, and you are likely going to be able to replace your existing bots with Matrix bots. It's true that IRC has a long and impressive history with lots of various bots doing various things, but given how young Matrix is, there's still a good variety:

  • maubot: generic bot with tons of usual plugins like sed, dice, karma, xkcd, echo, rss, reminder, translate, react, exec, gitlab/github webhook receivers, weather, etc
  • opsdroid: framework to implement "chat ops" in Matrix, connects with Matrix, GitHub, GitLab, Shell commands, Slack, etc
  • matrix-nio: another framework, used to build lots more bots like:
    • hemppa: generic bot with various functionality like weather, RSS feeds, calendars, cron jobs, OpenStreetmaps lookups, URL title snarfing, wolfram alpha, astronomy pic of the day, Mastodon bridge, room bridging, oh dear
    • devops: ping, curl, etc
    • podbot: play podcast episodes from AntennaPod
    • cody: Python, Ruby, Javascript REPL
    • eno: generic bot, "personal assistant"
  • mjolnir: moderation bot
  • hookshot: bridge with GitLab/GitHub
  • matrix-monitor-bot: latency monitor

One thing I haven't found an equivalent for is Debian's MeetBot. There's an archive bot but it doesn't have topics or a meeting chair, or HTML logs.

Working on Matrix

As a developer, I find Matrix kind of intimidating. The specification is huge. The official specification itself looks somewhat digestible: it's only 6 APIs, so that looks, at first, kind of reasonable. But whenever you start asking complicated questions about Matrix, you quickly fall into the Matrix Spec Change specification (which, yes, is a separate specification). And there are literally hundreds of MSCs flying around. It's hard to tell what's been adopted and what hasn't, and even harder to figure out whether your specific client has implemented it.

(One trendy answer to this problem is to "rewrite it in Rust": the Matrix people are working on implementing a lot of those specifications in a matrix-rust-sdk that's designed to take the implementation details away from users.)

Just taking the latest weekly Matrix report, you find that three new MSCs were proposed, just last week! There's even a graph that shows the number of MSCs growing steadily, at 600+ proposals total, with the majority (300+) "new". I would guess the "merged" ones are at about 150.

That's a lot of text which includes stuff like 3D worlds which, frankly, I don't think you should be working on when you have such important security and usability problems. (The internet as a whole, arguably, doesn't fare much better. RFC600 is a really obscure discussion about "INTERFACING AN ILLINOIS PLASMA TERMINAL TO THE ARPANET". Maybe that's how many MSCs will end up as well, left forgotten in the pits of history.)

And that's the thing: maybe the Matrix people have a different objective than I have. They want to connect everything to everything, and make Matrix a generic transport for all sorts of applications, including virtual reality, collaborative editors, and so on.

I just want secure, simple messaging. Possibly with good file transfers, and video calls. That it works with existing stuff is good, and it should be federated to remove the "Signal point of failure". So I'm a bit worried with the direction all those MSCs are taking, especially when you consider that clients other than Element are still struggling to keep up with basic features like end-to-end encryption or room discovery, never mind voice or spaces...

Conclusion

Overall, Matrix is somewhere in the space XMPP was a few years ago. It has a ton of features, pretty good clients, and a large community. It seems to have gained some of the momentum that XMPP has lost. It may have the most potential to replace Signal if something bad were to happen to it (like, I don't know, getting banned or going nuts with cryptocurrency)...

But it's really not there yet, and I don't see Matrix trying to get there either, which is a bit worrisome.

Looking back at history

I'm also worried that we are repeating the errors of the past. The history of federated services is really fascinating: IRC, FTP, HTTP, and SMTP were all created in the early days of the internet, and are all still around (except, arguably, FTP, which was removed from major browsers recently). All of them had to face serious challenges in growing their federation.

IRC had numerous conflicts and forks, both at the technical and at the political level. The history of IRC is really something that anyone working on a federated system should study in detail, because they are bound to make the same mistakes if they are not familiar with it. The "short" version is:

  • 1988: Finnish researcher publishes first IRC source code
  • 1989: 40 servers worldwide, mostly universities
  • 1990: EFnet ("eris-free network") fork which blocks the "open relay", named Eris - followers of Eris form the A-net, which promptly dissolves itself, with only EFnet remaining
  • 1992: Undernet fork, which offered authentication ("services"), routing improvements and timestamp-based channel synchronisation
  • 1994: DALnet fork, from Undernet, again on a technical disagreement
  • 1995: Freenode founded
  • 1996: IRCnet forks from EFnet, following a flame war of historical proportion, splitting the network between Europe and the Americas
  • 1997: Quakenet founded
  • 1999: (XMPP founded)
  • 2001: 6 million users, OFTC founded
  • 2002: DALnet peaks at 136,000 users
  • 2003: IRC as a whole peaks at 10 million users, EFnet peaks at 141,000 users
  • 2004: (Facebook founded), Undernet peaks at 159,000 users
  • 2005: Quakenet peaks at 242,000 users, IRCnet peaks at 136,000 (Youtube founded)
  • 2006: (Twitter founded)
  • 2009: (WhatsApp, Pinterest founded)
  • 2010: (TextSecure AKA Signal, Instagram founded)
  • 2011: (Snapchat founded)
  • ~2013: Freenode peaks at ~100,000 users
  • 2016: IRCv3 standardisation effort started (TikTok founded)
  • 2021: Freenode self-destructs, Libera chat founded
  • 2022: Libera peaks at 50,000 users, OFTC peaks at 30,000 users

(The numbers were taken from the Wikipedia page and Netsplit.de. Note that I also include other networks' launches in parentheses for context.)

Pretty dramatic, don't you think? Eventually, somehow, IRC became irrelevant for most people: few people are even aware of it now. With fewer than a million active users, it's smaller than Mastodon, XMPP, or Matrix at this point.1 If I were to venture a guess, I'd say that infighting, lack of a standardization body, and a somewhat annoying protocol meant the network could not grow. It's also possible that the decentralised yet centralised structure of IRC networks limited their reliability and growth.

But large social media companies have also taken over the space: observe how IRC numbers peak around the time the wave of large social media companies emerged, especially Facebook (2.9B users!!) and Twitter (400M users).

Where the federated services are in history

Right now, Matrix, and Mastodon (and email!) are at the "pre-EFnet" stage: anyone can join the federation. Mastodon has started working on a global block list of fascist servers which is interesting, but it's still an open federation. Right now, Matrix is totally open, but matrix.org publishes a (federated) block list of hostile servers (#matrix-org-coc-bl:matrix.org, yes, of course it's a room).

Interestingly, Email is also in that stage, where there are block lists of spammers, and it's a race between those blockers and spammers. Large email providers, obviously, are getting closer to the EFnet stage: you could consider they only accept email from themselves or between themselves. It's getting increasingly hard to deliver mail to Outlook and Gmail for example, partly because of bias against small providers, but also because they are including more and more machine-learning tools to sort through email and those systems are, fundamentally, unknowable. It's not quite the same as splitting the federation the way EFnet did, but the effect is similar.

HTTP has somehow managed to live in a parallel universe, as it's technically still completely federated: anyone can start a web server if they have a public IP address and anyone can connect to it. The catch, of course, is how you find the darn thing. Which is how Google became one of the most powerful corporations on earth, and how they became the gatekeepers of human knowledge online.

I have only briefly mentioned XMPP here, and my XMPP fans will undoubtedly comment on that, but I think it's somewhere in the middle of all of this. It was co-opted by Facebook and Google, and both corporations have abandoned it to its fate. I remember fondly the days where I could do instant messaging with my contacts who had a Gmail account. Those days are gone, and I don't talk to anyone over Jabber anymore, unfortunately. And this is a threat that Matrix still has to face.

It's also the threat Email is currently facing. On the one hand corporations like Facebook want to completely destroy it and have mostly succeeded: many people just have an email account to register on things and talk to their friends over Instagram or (lately) TikTok (which, I know, is not Facebook, but they started that fire).

On the other hand, you have corporations like Microsoft and Google who are still using and providing email services (because, frankly, you still do need email for stuff, just like fax is still around), but they are more and more isolated in their own silos. At this point, it's only a matter of time before they reach critical mass and just decide that the risk of allowing external mail in is not worth the cost. They'll simply flip the switch and work on an allow-list principle. Then we'll have closed the loop and email will be dead, just like IRC is "dead" now.

I wonder which path Matrix will take. Could it liberate us from these vicious cycles?

Update: this generated some discussions on lobste.rs.


  1. According to Wikipedia, there are currently about 500 distinct IRC networks operating, on about 1,000 servers, serving over 250,000 users. In contrast, Mastodon seems to be around 5 million users, Matrix.org claimed at FOSDEM 2021 to have about 28 million globally visible accounts, and Signal lays claim to over 40 million souls. XMPP claims to have "millions" of users on the xmpp.org homepage but the FAQ says they don't actually know. On the proprietary silo side of the fence, this page says

    • Facebook: 2.9 billion users
    • WhatsApp: 2B
    • Instagram: 1.4B
    • TikTok: 1B
    • Snapchat: 500M
    • Pinterest: 480M
    • Twitter: 397M

    Notable omission from that list: Youtube, with its mind-boggling 2.6 billion users...

    Those are not the kind of numbers you just "need to convince a brother or sister" to grow the network...

21 July, 2022 03:57PM

June 10, 2022

Iustin Pop

Still alive, 2022 version

Still alive, despite the blog being silent for more than a year.

Nothing bad happened, but there was always something more important (or interesting) to do than write a post. And I did say many, many times - “Oh, I should write a post about this thing I just did or learned about”, but I never followed up.

And I was close to forgetting entirely about blogging (ahem, it’s a bit much calling it “blogging”), until someone I follow posted something along the lines of “I have this half-written post for many months that I can’t finish, here’s some pictures instead”. And from that followed an interesting discussion, and the similarities between our “why I didn’t blog recently” stories were very interesting, despite different countries, continents, etc.

So yes, I don’t know what happened - besides the chaos that even the end of Covid caused in our lives, and the psychological impact of the Ukraine invasion, though all of that is relatively recent - that made me unable to muster the energy to write posts again.

I even had a half-written post in late June last year, never finished. Sigh. I won’t even bring up open-source work, since I haven’t done that either.

Life. Sometimes things just happen. But yes, I did get many Garmin badges in the last 12 months 🙂 Oh, and “Top Gun: Maverick” is awesome. A movie, but an awesome movie.

See you!

10 June, 2022 09:22PM

June 11, 2022

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

Updating a rooted Pixel 3a

A short while after getting a Pixel 3a, I decided to root it, mostly to have more control over the charging procedure. In order to preserve battery life, I like my phone to stop charging at around 75% of full battery capacity and to shut down automatically at around 12%. Some Android ROMs have extra settings to manage this, but LineageOS unfortunately does not.

Android already comes with a fairly complex mechanism to handle the charge cycle, but it is mostly controlled by the kernel and cannot be easily configured by end-users. acc is a higher-level "systemless" interface for the Android kernel battery management, but one needs root to do anything interesting with it. Once rooted, you can use the AccA app instead of playing on the command line to fine tune your battery settings.
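
For reference, a rooted shell lets you drive acc directly. The one-liner below is only a sketch: the two-argument form (pause capacity, then resume capacity) is an assumption on my part, and acc's command-line syntax has changed between versions, so check acc --help on your device first.

$ su -c 'acc 75 70'   # assumed syntax: pause charging at 75%, resume below 70%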

Sadly, having a rooted phone also means I need to re-root it each time there is an OS update (typically each week).

Somehow, I keep forgetting the exact procedure to do this! Hopefully, I will be able to use this post as a reference in the future :)

Note that these instructions might not apply to your exact phone model; proceed with caution!

Extract the boot.img file

This procedure mostly comes from the LineageOS documentation on extracting proprietary blobs from the payload.

  1. Download the latest LineageOS image for your phone.

  2. unzip the image to get the payload.bin file inside it.

  3. Clone the LineageOS scripts git repository:

    $ git clone https://github.com/LineageOS/scripts

  4. extract the boot image (requires python3-protobuf):

    $ mkdir extracted-payload
    $ python3 scripts/update-payload-extractor/extract.py payload.bin --output_dir extracted-payload

You should now have a boot.img file.
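
Put together, the extraction steps condense to something like the following sketch; the zip file name here is a placeholder for whatever build you actually downloaded:

$ unzip lineage-19.1-foo-signed.zip payload.bin
$ git clone https://github.com/LineageOS/scripts
$ mkdir extracted-payload
$ python3 scripts/update-payload-extractor/extract.py payload.bin --output_dir extracted-payload
$ ls extracted-payload/boot.img   # sanity check: the extracted boot image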

Patch the boot image file using Magisk

  1. Upload the boot.img file you previously extracted to your device.

  2. Open Magisk and patch the boot.img file.

  3. Download the patched file back on your computer.

Flash the patched boot image

  1. Enable ADB debug mode on your phone.

  2. Reboot into fastboot mode.

    $ adb reboot fastboot

  3. Flash the patched boot image file:

    $ fastboot flash boot magisk_patched-foo.img

  4. Disable ADB debug mode on your phone.
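
For completeness, the flashing part condenses to the commands below; the final fastboot reboot is my addition rather than part of the numbered steps above, and step 4 (disabling ADB debugging) still applies once the phone is back up:

$ adb reboot fastboot
$ fastboot flash boot magisk_patched-foo.img
$ fastboot reboot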

Troubleshooting

In an ideal world, you would do this entire process each time you upgrade to a new LineageOS version. Sadly, this creates friction and makes updating much more troublesome.

To simplify things, you can try to flash an old patched boot.img file after upgrading, instead of generating it each time.

In my experience, it usually works. When it does not, the device behaves weirdly after a reboot and things that require proprietary blobs (like WiFi) will stop working.

If that happens:

  1. Download the latest LineageOS version for your phone.

  2. Reboot into recovery (Power + Volume Down).

  3. Click on "Apply Updates"

  4. Sideload the ROM:

    $ adb sideload lineageos-foo.zip
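
If you prefer to avoid the key combo, the recovery part can mostly be driven over ADB as well. Using adb reboot recovery instead of Power + Volume Down is an assumption here, and the exact menu labels vary between recoveries, so fall back to the buttons if it does not behave as expected:

$ adb reboot recovery
(in the recovery menu, choose "Apply Updates", then start the ADB sideload)
$ adb sideload lineageos-foo.zip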

11 June, 2022 04:00AM by Louis-Philippe Véronneau

June 09, 2022

Enrico Zini

Updating cbqt for bullseye

Back in 2017 I did some work to set up a cross-building toolchain for QT Creator that takes advantage of Debian's packaging for the whole dependency ecosystem.

It ended with cbqt, which is a little script that sets up a chroot to hold cross-build-dependencies, to avoid conflicting with packages in the host system, and sets up a qmake alternative to make use of them.

Today I'm dusting off that work, to ensure it works on Debian bullseye.

Resetting QT Creator

To make things reproducible, I wanted to reset QT Creator's configuration.

Besides purging and reinstalling the package, one needs to manually remove:

  • ~/.config/QtProject
  • ~/.cache/QtProject/
  • /usr/share/qtcreator/QtProject, which is where configuration is stored if you used sdktool to programmatically configure Qt Creator (see for example this post, and Debian bug #1012561). A consolidated sketch of the whole reset follows this list.
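
As a minimal sketch, the whole reset boils down to the following (the package name qtcreator is assumed; adjust if your setup differs):

sudo apt purge qtcreator
rm -rf ~/.config/QtProject ~/.cache/QtProject/
sudo rm -rf /usr/share/qtcreator/QtProject   # only present if sdktool was used
sudo apt install qtcreator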

Updating cbqt

Easy start, change the distribution for the chroot:

-DIST_CODENAME = "stretch"
+DIST_CODENAME = "bullseye"

Adding LIBDIR

Something else does not work:

Test$ qmake-armhf -makefile
Info: creating stash file …/Test/.qmake.stash
Test$ make
[...]
/usr/bin/arm-linux-gnueabihf-g++ -Wl,-O1 -Wl,-rpath-link,…/armhf/lib/arm-linux-gnueabihf -Wl,-rpath-link,…/armhf/usr/lib/arm-linux-gnueabihf -Wl,-rpath-link,…/armhf/usr/lib/ -o Test main.o mainwindow.o moc_mainwindow.o   …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Widgets.so …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Gui.so …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Core.so -lGLESv2 -lpthread
/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/bin/ld: cannot find -lGLESv2
collect2: error: ld returned 1 exit status
make: *** [Makefile:146: Test] Error 1

I figured that now I also need to set QMAKE_LIBDIR and not just QMAKE_RPATHLINKDIR:

--- a/cbqt
+++ b/cbqt
@@ -241,18 +241,21 @@ include(../common/linux.conf)
 include(../common/gcc-base-unix.conf)
 include(../common/g++-unix.conf)

+QMAKE_LIBDIR += {chroot.abspath}/lib/arm-linux-gnueabihf
+QMAKE_LIBDIR += {chroot.abspath}/usr/lib/arm-linux-gnueabihf
+QMAKE_LIBDIR += {chroot.abspath}/usr/lib/
 QMAKE_RPATHLINKDIR += {chroot.abspath}/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/

Now it links again:

Test$ qmake-armhf -makefile
Test$ make
/usr/bin/arm-linux-gnueabihf-g++ -Wl,-O1 -Wl,-rpath-link,…/armhf/lib/arm-linux-gnueabihf -Wl,-rpath-link,…/armhf/usr/lib/arm-linux-gnueabihf -Wl,-rpath-link,…/armhf/usr/lib/ -o Test main.o mainwindow.o moc_mainwindow.o   -L…/armhf/lib/arm-linux-gnueabihf -L…/armhf/usr/lib/arm-linux-gnueabihf -L…/armhf/usr/lib/ …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Widgets.so …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Gui.so …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Core.so -lGLESv2 -lpthread

Making it work in Qt Creator

Time to try it in Qt Creator, and sadly it fails:

/armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf:76: Variable QMAKE_CXX.COMPILER_MACROS is not defined.

QMAKE_CXX.COMPILER_MACROS is not defined

I traced it to this bit in armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf (nonrelevant bits deleted):

isEmpty($${target_prefix}.COMPILER_MACROS) {
    msvc {
        # …
    } else: gcc|ghs {
        vars = $$qtVariablesFromGCC($$QMAKE_CXX)
    }
    for (v, vars) {
        # …
        $${target_prefix}.COMPILER_MACROS += $$v
    }
    cache($${target_prefix}.COMPILER_MACROS, set stash)
} else {
    # …
}

It turns out that qmake is not able to realise that the compiler is gcc, so vars does not get set, nothing is set in COMPILER_MACROS, and qmake fails.

Reproducing it on the command line

When run manually, however, qmake-armhf worked, so it would be good to know how Qt Creator is actually running qmake. Since it frustratingly does not show what commands it runs, I'll have to strace it:

strace -e trace=execve --string-limit=123456 -o qtcreator.trace -f qtcreator

And there it is:

$ grep qmake- qtcreator.trace
1015841 execve("/usr/local/bin/qmake-armhf", ["/usr/local/bin/qmake-armhf", "-query"], 0x56096e923040 /* 54 vars */) = 0
1015865 execve("/usr/local/bin/qmake-armhf", ["/usr/local/bin/qmake-armhf", "…/Test/Test.pro", "-spec", "arm-linux-gnueabihf", "CONFIG+=debug", "CONFIG+=qml_debug"], 0x7f5cb4023e20 /* 55 vars */) = 0

I run the command manually and indeed I reproduce the problem:

$ /usr/local/bin/qmake-armhf Test.pro -spec arm-linux-gnueabihf CONFIG+=debug CONFIG+=qml_debug
…/armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf:76: Variable QMAKE_CXX.COMPILER_MACROS is not defined.

I try removing options until I find the one that breaks it and... now it's always broken! Even manually running qmake-armhf, like I did earlier, stopped working:

$ rm .qmake.stash
$ qmake-armhf -makefile
…/armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf:76: Variable QMAKE_CXX.COMPILER_MACROS is not defined.

Debugging toolchain.prf

I tried purging and reinstalling qtcreator, and recreating the chroot, but qmake-armhf is staying broken. I'll let that be, and try to debug toolchain.prf.

By grepping gcc in the mkspecs directory, I managed to figure out that:

  • The } else: gcc|ghs { test is matching the value(s) of QMAKE_COMPILER
  • QMAKE_COMPILER can have multiple values, separated by space
  • If in armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/arm-linux-gnueabihf/qmake.conf I set QMAKE_COMPILER = gcc arm-linux-gnueabihf-gcc, then things work again.

Sadly, I failed to find reference documentation for QMAKE_COMPILER's syntax and behaviour. I also failed to figure out why qmake-armhf worked earlier, and I am also failing to restore the system to a state where it works again. Maybe I dreamt that it worked? Or maybe I had some manual change lying around from some previous fiddling with things?

Anyway at least now I have the fix:

--- a/cbqt
+++ b/cbqt
@@ -248,7 +248,7 @@ QMAKE_RPATHLINKDIR += {chroot.abspath}/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/

-QMAKE_COMPILER          = {chroot.arch_triplet}-gcc
+QMAKE_COMPILER          = gcc {chroot.arch_triplet}-gcc

 QMAKE_CC                = /usr/bin/{chroot.arch_triplet}-gcc

Fixing a compiler mismatch warning

In setting up the kit, Qt Creator also complained that the compiler from qmake did not match the one configured in the kit. That was easy to fix, by pointing at the host system cross-compiler in qmake.conf:

 QMAKE_COMPILER          = {chroot.arch_triplet}-gcc

-QMAKE_CC                = {chroot.arch_triplet}-gcc
+QMAKE_CC                = /usr/bin/{chroot.arch_triplet}-gcc

 QMAKE_LINK_C            = $$QMAKE_CC
 QMAKE_LINK_C_SHLIB      = $$QMAKE_CC

-QMAKE_CXX               = {chroot.arch_triplet}-g++
+QMAKE_CXX               = /usr/bin/{chroot.arch_triplet}-g++

 QMAKE_LINK              = $$QMAKE_CXX
 QMAKE_LINK_SHLIB        = $$QMAKE_CXX

Updated setup instructions

Create an armhf environment:

sudo cbqt ./armhf --create --verbose

Create a qmake wrapper that builds with this environment:

sudo ./cbqt ./armhf --qmake -o /usr/local/bin/qmake-armhf

Install the build-dependencies that you need:

# Note: :arch is added automatically to package names if no arch is explicitly specified
sudo ./cbqt ./armhf --install libqt5svg5-dev libmosquittopp-dev qtwebengine5-dev

Build with qmake

Use qmake-armhf instead of qmake and it works perfectly:

qmake-armhf -makefile
make

Set up Qt Creator

Configure a new Kit in Qt Creator:

  1. Tools/Options, then Kits, then Add
  2. Name: armhf (or anything you like)
  3. In the Qt Versions tab, click Add then set the path of the new Qt to /usr/local/bin/qmake-armhf. Click Apply.
  4. Back in the Kits, select the Qt version you just created in the Qt version field
  5. In Compilers, select the ARM versions of GCC. If they do not appear, install crossbuild-essential-armhf, then in the Compilers tab click Re-detect and then Apply to make them available for selection
  6. Dismiss the dialog with "OK": the new kit is ready

Now you can choose the default kit to build and run locally, and the armhf kit for remote cross-development.

I tried looking at sdktool to automate this step, and it requires a nontrivial amount of work to do it reliably, so these manual instructions will have to do.

Credits

This has been done as part of my work with Truelite.

09 June, 2022 10:15AM

June 06, 2022

hackergotchi for Norbert Preining

Norbert Preining

Modern world

Just got reminded of this great short movie!

How befitting.

06 June, 2022 11:42PM by Norbert Preining