<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Subtitles on Nelson Chen's Blog</title><link>https://mindflakes.com/tags/subtitles/</link><description>Recent content in Subtitles on Nelson Chen's Blog</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><copyright>Copyright Nelson Chen</copyright><lastBuildDate>Tue, 16 Jun 2026 15:35:00 -0700</lastBuildDate><atom:link href="https://mindflakes.com/tags/subtitles/index.xml" rel="self" type="application/rss+xml"/><item><title>Agent-assisted bilingual subtitles for group watches</title><link>https://mindflakes.com/posts/2026/06/16/agent-assisted-bilingual-subtitles/</link><pubDate>Tue, 16 Jun 2026 15:35:00 -0700</pubDate><guid>https://mindflakes.com/posts/2026/06/16/agent-assisted-bilingual-subtitles/</guid><description>&lt;p&gt;I made some agent-assisted subtitle tools for group watches.&lt;/p&gt;
&lt;p&gt;The short version is that I wanted one subtitle track with English and Chinese at the same time. Some people read one language faster, some read the other faster, and switching subtitle tracks is not a group activity.&lt;/p&gt;
&lt;p&gt;No repo link for this one. The subtitle files are not mine to redistribute, and the interesting part is the workflow anyway.&lt;/p&gt;
&lt;h2 id="the-group-watch-problem"&gt;The group watch problem&lt;/h2&gt;
&lt;p&gt;The naive version sounds extremely simple:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Take an English subtitle file.&lt;/li&gt;
&lt;li&gt;Take a Chinese subtitle file.&lt;/li&gt;
&lt;li&gt;Put both lines into one subtitle cue.&lt;/li&gt;
&lt;li&gt;Watch the thing.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Unfortunately, subtitle files are not just text. They are text, cue numbers, timestamps, line breaks, release-specific offsets, missing cues, extra signage cues, and sometimes an offset that slowly drifts across the whole movie.&lt;/p&gt;
&lt;p&gt;The first version I tried was basically a mux. If the English and Chinese files had matching cue numbers, it used the English timestamp and stacked the two texts together.&lt;/p&gt;
&lt;p&gt;That worked well enough for one case. It was also obviously fragile.&lt;/p&gt;
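&lt;p&gt;A minimal sketch of that first mux, assuming both files are well-formed &lt;code&gt;.srt&lt;/code&gt; and share cue numbers. The function names &lt;code&gt;parse_srt&lt;/code&gt; and &lt;code&gt;mux_by_cue_number&lt;/code&gt; are made up for illustration and are not the original script:&lt;/p&gt;

```python
import re


def parse_srt(text):
    """Return {cue_number: (timing_line, cue_text)} for a well-formed SRT string.

    Deliberately naive: blocks are separated by blank lines, and any block
    without a numeric first line, a timing line, and some text is skipped.
    """
    cues = {}
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) >= 3 and lines[0].strip().isdigit():
            cues[int(lines[0])] = (lines[1].strip(), "\n".join(lines[2:]))
    return cues


def mux_by_cue_number(english_srt, chinese_srt):
    """Stack the Chinese text over the English text wherever cue numbers match.

    The English timing line is always kept; Chinese timestamps are ignored.
    """
    english = parse_srt(english_srt)
    chinese = parse_srt(chinese_srt)
    blocks = []
    for number in sorted(english):
        timing, en_text = english[number]
        if number in chinese:
            text = chinese[number][1] + "\n" + en_text
        else:
            text = en_text  # no Chinese counterpart: keep the English line alone
        blocks.append(f"{number}\n{timing}\n{text}\n")
    return "\n".join(blocks)
```

&lt;p&gt;The fragility is visible in the code: one missing or extra cue in either file shifts every later cue number, and the Chinese text lands on the wrong lines from then on.&lt;/p&gt;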
&lt;h2 id="the-annoying-version"&gt;The annoying version&lt;/h2&gt;
&lt;p&gt;The more interesting case had two subtitle tracks that did not line up with one fixed offset.&lt;/p&gt;
&lt;p&gt;If it were just &amp;ldquo;add 12 seconds to every Chinese cue,&amp;rdquo; this would not be worth writing about. But the offset changed over time. It started around one value and gradually moved toward another.&lt;/p&gt;
&lt;p&gt;So the script got more specific:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;parse both subtitle files&lt;/li&gt;
&lt;li&gt;treat the English track as the timing source&lt;/li&gt;
&lt;li&gt;search for the best Chinese offset in local windows&lt;/li&gt;
&lt;li&gt;smooth those offsets&lt;/li&gt;
&lt;li&gt;remap Chinese cues onto the English timeline&lt;/li&gt;
&lt;li&gt;merge Chinese text into the nearest overlapping English cue&lt;/li&gt;
&lt;li&gt;preserve Chinese-only cues as their own events&lt;/li&gt;
&lt;li&gt;write an alignment report for things that looked suspicious&lt;/li&gt;
&lt;/ul&gt;
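&lt;p&gt;The offset search, smoothing, and remap steps above can be sketched roughly like this. It is one possible approach under stated assumptions, not the script I actually ran: cue times are plain seconds, and the window size, search range, step, and every function name here are illustrative guesses.&lt;/p&gt;

```python
from bisect import bisect_left


def nearest_gap(sorted_times, t):
    """Distance from t to the closest value in a non-empty sorted list."""
    i = bisect_left(sorted_times, t)
    return min(abs(t - x) for x in sorted_times[max(i - 1, 0):i + 1])


def best_offset(en_starts, zh_starts, candidates, cap=1.0):
    """Candidate offset (seconds added to Chinese times) with the lowest cost.

    Cost is the summed distance from each shifted Chinese start to its nearest
    English start, capped so unmatched cues do not dominate.
    """
    en_sorted = sorted(en_starts)

    def cost(offset):
        return sum(min(nearest_gap(en_sorted, z + offset), cap) for z in zh_starts)

    return min(candidates, key=cost)


def local_anchors(en_starts, zh_starts, window=300.0, max_shift=30.0, step=0.1):
    """One (chinese_time, offset) anchor per window of the Chinese track."""
    if not en_starts or not zh_starts:
        return []
    steps = int(round(2 * max_shift / step))
    candidates = [round(-max_shift + k * step, 3) for k in range(steps + 1)]
    anchors = []
    t, last = min(zh_starts), max(zh_starts)
    while last >= t:
        chunk = [z for z in zh_starts if z >= t and t + window > z]
        if len(chunk) >= 5:  # too few cues gives an unreliable estimate
            centre = sum(chunk) / len(chunk)
            anchors.append((centre, best_offset(en_starts, chunk, candidates)))
        t += window
    return anchors


def smooth(anchors):
    """Median-of-three on interior anchors, to drop a single outlier window."""
    out = list(anchors)
    for i in range(1, len(anchors) - 1):
        trio = sorted(o for _, o in anchors[i - 1:i + 2])
        out[i] = (anchors[i][0], trio[1])
    return out


def offset_at(anchors, t):
    """Linearly interpolate the offset at Chinese time t (anchors non-empty)."""
    if anchors[0][0] >= t:
        return anchors[0][1]
    for (t0, o0), (t1, o1) in zip(anchors, anchors[1:]):
        if t1 >= t:
            return o0 + (o1 - o0) * (t - t0) / (t1 - t0)
    return anchors[-1][1]


def remap(zh_cues, anchors):
    """Shift (start, end, text) Chinese cues onto the English timeline."""
    return [(s + offset_at(anchors, s), e + offset_at(anchors, s), text)
            for s, e, text in zh_cues]
```

&lt;p&gt;Interpolating between window anchors is what handles the gradual drift: each cue gets its own offset instead of one global number.&lt;/p&gt;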
&lt;p&gt;The report was the important part. I did not need the script to pretend everything was perfect. I needed it to tell me where to look.&lt;/p&gt;
&lt;p&gt;The useful output was not just the merged &lt;code&gt;.srt&lt;/code&gt;. It was also a short review list:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;low-confidence merges&lt;/li&gt;
&lt;li&gt;Chinese-only cues&lt;/li&gt;
&lt;li&gt;English cues with no matching Chinese text&lt;/li&gt;
&lt;li&gt;local offset anchors&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That made the workflow tolerable. Generate, skim the report, spot-check the weird timestamps, adjust the heuristics if needed, and regenerate.&lt;/p&gt;
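&lt;p&gt;For illustration, here is a small sketch of the merge plus review list. It assumes cues are &lt;code&gt;(start, end, text)&lt;/code&gt; tuples in seconds with the Chinese cues already remapped, and it reads &amp;ldquo;nearest overlapping&amp;rdquo; as &amp;ldquo;largest overlap&amp;rdquo;. The &lt;code&gt;merge_with_report&lt;/code&gt; name and the 0.5 confidence threshold are my assumptions, and the local offset anchors are left out of this report for brevity:&lt;/p&gt;

```python
def overlap(a_start, a_end, b_start, b_end):
    """Seconds shared by two intervals (0.0 if they do not overlap)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def merge_with_report(english, chinese, low_confidence=0.5):
    """Merge remapped Chinese cues into English cues and list what to review.

    Returns (cues, report). Each merged cue keeps the English timing and
    stacks the Chinese text above the English text.
    """
    merged = [[s, e, [], text] for s, e, text in english]
    report = {"low_confidence": [], "chinese_only": [], "english_only": []}
    extra = []
    for zs, ze, ztext in chinese:
        shared = [overlap(zs, ze, s, e) for s, e, _ in english]
        best = max(range(len(english)), key=shared.__getitem__, default=None)
        if best is None or shared[best] == 0.0:
            # Nothing on the English side overlaps: keep it as its own event.
            extra.append((zs, ze, ztext))
            report["chinese_only"].append((zs, ztext))
            continue
        merged[best][2].append(ztext)
        # Fraction of the Chinese cue covered by the chosen English cue.
        ratio = shared[best] / max(ze - zs, 1e-9)
        if low_confidence > ratio:
            report["low_confidence"].append((zs, ztext, round(ratio, 2)))
    cues = []
    for s, e, zh_lines, en_text in merged:
        if not zh_lines:
            report["english_only"].append((s, en_text))
        cues.append((s, e, "\n".join(zh_lines + [en_text])))
    cues.extend(extra)
    cues.sort(key=lambda cue: cue[0])
    return cues, report
```

&lt;p&gt;Each report entry carries a start time, so spot-checking means seeking to that timestamp in the player.&lt;/p&gt;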
&lt;h2 id="where-agents-helped"&gt;Where agents helped&lt;/h2&gt;
&lt;p&gt;This is exactly the kind of small, specific tool where agents are useful.&lt;/p&gt;
&lt;p&gt;I did not need a general subtitle product. I did not need a GUI. I did not need a library that handles every movie ever made. I needed one script for one group watch, and then a slightly better script for a different subtitle mess.&lt;/p&gt;
&lt;p&gt;An agent can help make that cheap enough to bother with.&lt;/p&gt;
&lt;p&gt;The first pass can be dumb:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;same cue number -&amp;gt; same timestamp -&amp;gt; Chinese line + English line
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then the next pass can be less dumb:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span style="display:flex;"&gt;&lt;span&gt;local timing windows -&amp;gt; estimated offset -&amp;gt; remapped cues -&amp;gt; overlap match -&amp;gt; review report
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The agent also made it easier to keep the code disposable. I did not have to lovingly design a subtitle framework. I could just ask for the thing I needed, run it, look at the bad spots, and ask for a better report.&lt;/p&gt;
&lt;p&gt;If you want to do the same thing, point &lt;a href="https://developers.openai.com/codex/"&gt;Codex&lt;/a&gt;, &lt;a href="https://code.claude.com/docs/en/overview"&gt;Claude Code&lt;/a&gt;, or &lt;a href="https://antigravity.google/"&gt;Google Antigravity&lt;/a&gt; at this post and your own subtitle files. The post is basically the spec: one timing source, one translated track, local offset search, merged output, and a review report. That is enough for an agent to build the throwaway script for your specific case.&lt;/p&gt;
&lt;p&gt;Anyway, it worked for the group watch. We had a very good time, and nobody was lost or left behind!&lt;/p&gt;</description></item></channel></rss>