Lessons Learned from the Friends Episode Tagging Project

As I was tagging all the episodes of Friends, lots of thoughts started running through my head: Was I doing this right? Could this be done better? What about this or that edge case?

I started writing them down, and this is the result – a series of “lessons” I learned. Now, to be fair, I don’t have answers for a lot of these, so many are really just “open questions.”

I think tagging is such an amorphous and idiosyncratic method of organization, that it defies any hard boundaries. For every situation, a tagging architecture is whatever you need it to be in that moment.

As such, take this all with a grain of salt. Do what you need to do, but maybe some ideas below will be helpful along the way.

Finally, in the text below, if something is a blue, stylized tag that is clickable (ex: marcel) then that was a tag I actually used and that you can click to see results. However, in some cases, I talk about theoretical tags I did not actually use, so I formatted them differently to not waste your time clicking on things that weren’t actual tags (ex: not-a-real-tag).

Basic theory and definitions

At its most basic level, tagging something means that we associate it with a larger concept. We’re saying Item X is somehow related to Concept Y (in the form of a tag).

In its most pure form, we’re not commenting on how these things are related, just that they are – Item X is somehow a member of the set identified by Tag Y. (I’ll talk more about the relationship to set theory below.)

Some definitions (for the sake of this document only – I don’t claim these are universal or accepted)

Item: a content object to which we apply a tag. This is your content.
Tag: a label we apply to one or more items; in most applications, this label is simply a short bit of text, but other implementations might have descriptions attached; a tag is not standalone – it has no reason to exist if not applied to at least one item. In most cases, tags are ad hoc, simply invented on-the-spot by whoever is doing the tagging.
Tag Assignment: the singular application of a specific tag to a specific item. The same tag can only be assigned once to the same item – there is no point in applying a tag more than one time. The first assignment of a specific tag effectively creates that tag. When the last assignment of a specific tag is removed, that tag ceases to exist.

flowchart LR
    I1[Item 1]
    I2[Item 2]
    I3[Item 3]

    T1[Tag A]
    T2[Tag B]
    T3[Tag C]
    T4[Tag D]

    I1 --> T1
    I1 --> T3
    I1 --> T4

    I2 -- Tag Assignment --> T1
    I2 --> T2
    I2 --> T4

I don’t mean to get pedantic about definitions, because tagging content is literally one of the simplest and most basic information architecture models. However, in some places below, I need to refer to some structural aspects, and I wanted to baseline a vocabulary.
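The three definitions above can be sketched in a few lines of Python. This is only an illustration of the vocabulary, not how this project was built; the `TagStore` class and its method names are made up for the example.

```python
from collections import defaultdict


class TagStore:
    """Tracks tag assignments; a tag exists only while it has at least one assignment."""

    def __init__(self) -> None:
        # tag name -> set of item ids carrying that tag
        self._items_by_tag: dict[str, set[str]] = defaultdict(set)

    def assign(self, item: str, tag: str) -> None:
        # A set makes repeat assignments harmless: the same tag
        # can only be attached to the same item once.
        self._items_by_tag[tag].add(item)

    def unassign(self, item: str, tag: str) -> None:
        items = self._items_by_tag.get(tag)
        if items is None:
            return
        items.discard(item)
        if not items:
            # Removing the last assignment removes the tag itself.
            del self._items_by_tag[tag]

    def tags(self) -> set[str]:
        return set(self._items_by_tag)

    def items_for(self, tag: str) -> set[str]:
        return set(self._items_by_tag.get(tag, set()))


store = TagStore()
store.assign("Item 1", "Tag A")
store.assign("Item 2", "Tag A")
store.assign("Item 2", "Tag A")    # duplicate, no effect
store.unassign("Item 1", "Tag A")
store.unassign("Item 2", "Tag A")  # last assignment gone, so "Tag A" disappears
```

Note that there is no "create tag" or "delete tag" operation at all: tags appear and vanish purely as a side effect of assignments.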

How are tags different than categories?

Well, it depends on how you define “categories.” (But, as I’ll explain below, the definition of “tagging” isn’t much more exact.)

However, if we consider a default/standard tagging implementation, here’s the difference:

1. Categories are defined in advance by an administration/management team, and cannot be extended on-the-fly at the editorial level. Categories are a “top-down” architecture.
2. Categories are often (usually?) hierarchical. They exist in a parent-child relationship, with categorization proceeding from broader to narrower as you move deeper into the category hierarchy.
3. Given the difference in #1, the interface is usually different. Tags are entered as text (sometimes with auto-suggest or auto-complete), while categories are selected from a pre-existing list, so the UI is normally a dropdown, a series of checkboxes, or something similar.

Given these differences, can/should you have tags and categories in the same implementation? Or do you just need to pick one and stick with it?

…I don’t know. I’ve tried having both on a couple of projects, and never figured out how to differentiate them in editors’ heads. In several cases, when editors were talking about “tagging” and “categorizing” during project planning, they were just using different names for the same thing.

It might be valid to use categories for primary “navigational” differentiation, then use tags for more ad hoc “related content” type connections (you might not even display the tags in this case, instead just display the content the tags build a relation to). I’ve never done this, but this is the type of clear differentiation you would need for it to be intuitive for editors.

What does applying a tag say about an item?

When you apply a tag, what are you saying?

This Item is fundamentally about the concept represented by this Tag
This Item has some connection, however tenuous, to this Tag

How this relationship is implied depends on what the tag represents. The tag ross-rachel meant, “This episode advanced the storyline of the romance between Ross and Rachel.” While the tag treeger just meant, “Treeger appears in this episode somewhere.”

The “tag subject” is so variable. It might be a simple noun (in the case of Treeger), or it might be a larger conceptual framework (in the case of the relationship between Ross and Rachel, which was probably the most enduring storyline of the entire series).

The Ross-and-Rachel kind of tag is more all-encompassing than the Treeger kind. The latter is basically trivia, while the former almost always means one of the storylines was fundamentally about Ross and Rachel, and the contents of that storyline would impact further episodes.

The problem comes when multiple editors are tagging independently. Do all the editors have the same concept of what applying a tag means? If Editor A is tagging on trivia and Editor B is tagging on the gestalt of the episode, then you’re going to end up with tags at wildly different altitudes.

I came to call this the “tag scope.” The scope of a tag is the framework or perspective through which it is applied. It’s the heuristic a specific editor is using to analyze an item.

Should tag scopes be segregated or somehow namespaced?

This is quite related to the tag-relation question above – as I assigned more and more tags, I got to wondering if it would be helpful to group or “namespace” tags according to their scope.

For example, any episode could be tagged along several different scopes.

The appearance of some thing or concept: marcel
A continuation of a storyline: apartment-switch
The type of episode: bottle-episode

Should I have specified a way to separate tags into these buckets, meaning I somehow qualified the assignment? For example, should I have a convention where I namespaced the tags with a prefix: appearance-marcel, plot-apartment-switch, type-flashback.

Or, should I have different fields for this? Instead of a single field for “Tags,” should I have “Plot Tags”, “Appearance Tags”, and “Type Tags”?

This is one of those things that would make me feel good, for sure, but I just don’t know how it would help in this particular situation. Perhaps if a lot of people were collaborating on tags, it might help them put a mental framework around it? Like, it would help guide them to what should be tagged?

On the front-end, user-facing side, would I use the namespace to label the tags differently? Perhaps, use a different emoji for each type? Would this help the user understand the scope of a tag assignment?

Would namespaces help me avoid collisions? I’m searching for a practical example in this body of work, but I’m struggling to come up with one. However, in some other domain, you might have politics-china and art-china to refer to very different things, where just china would have caused a naming collision problem.
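To make the prefix convention concrete, here is a rough sketch of how a namespaced tag could be split apart. The `KNOWN_NAMESPACES` set and the `split_namespace` function are hypothetical; the scope names are just the prefixes floated above.

```python
from typing import Optional

# Hypothetical list of scopes; a real project would define its own.
KNOWN_NAMESPACES = {"appearance", "plot", "type"}


def split_namespace(tag: str) -> tuple[Optional[str], str]:
    """Return (namespace, name); namespace is None for an unscoped tag."""
    prefix, sep, rest = tag.partition("-")
    # Only treat the first word as a namespace if it is a known scope,
    # because the hyphen is also the ordinary word separator in tags.
    if sep and rest and prefix in KNOWN_NAMESPACES:
        return prefix, rest
    return None, tag


scoped = split_namespace("plot-apartment-switch")
unscoped = split_namespace("bottle-episode")
```

The catch is visible in the comment: since hyphens already separate words, the prefix is only unambiguous when checked against a fixed list of scopes. A dedicated separator character would avoid that lookup.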

(However, see below for some thoughts on “tag querying” that would influence this problem.)

But, finally, does this somehow violate the ethos of tagging? We’re imposing some structure on tags, which makes it drift into other organizational schemes. Consider large-scale episode types. Here’s the list:

flashback
bottle-episode
season-premiere and season-finale
two-parter along with part-one and part-two

…that’s about it. If we segregate those tags into their own structure, then why not make it a dropdown, or a category assignment? Is there some value in tags being a completely “greenfield” selection? If we tighten up the scope, are we defeating the point?

When is an item “tag worthy”?

When does the relationship between…whatever, and a tag rise to the level of applying a tag?

For example, the tag chandler-monica means that the episode somehow advanced the relationship between Chandler and Monica….or does it? Should I have applied it to any episode that involved Chandler and Monica in a relationship?

There are kind of three “levels” to that situation –

1. Chandler and Monica appear in an episode
2. There is some reference – either spoken or visual – to Chandler and Monica being in a relationship
3. One plotline of the episode is centered on the progression of their relationship (“progression” is important here – a plotline could involve them, but not be about them, if that makes sense)

#1 is silly because both of them are in (almost) every episode (see below for “common-ness” as a problem).

#2 would be important during a specific period of time when their relationship was…novel; when it was new and a specific story arc. However, as the relationship wore on, #2 really didn’t apply much because the relationship kind of…settled (?) into the background.

Which means we’re left with #3 – episodes that very specifically involved some aspect of their relationship.

When Chandler and Monica started their relationship in London, and when they were subsequently hiding it from everyone, it was interesting and a key plotline. Once it was out in the open, and after they got married, that plotline… settled, a bit. It kind of faded into the background and was no longer noteworthy.

The larger point here is that a concept or theme is “tag worthy” only in relationship to the larger context. Something might have been tag-worthy at one point in time, and not in another.

What is our tolerance for error or omission with tagging?

When you tag something, you’re making an implicit claim: all things like this thing are also tagged. In a perfect world, a tag for ross-drinks-something would implicitly claim to represent every possible instance of Ross having a beverage for the entire series.

But, unless you go incredibly deep on something, checking and double-checking, and having multiple people with the same depth of knowledge as you review your assignments, you’re probably going to miss some things.

So, is tagging a situation where it has to be right or else shouldn’t be done at all? Or do we tacitly accept that tagging is just more… chill, and less precise than other methods?

I had various freakouts before releasing this project to the world. I remember checking the tag susan-rival and seeing that it was only assigned once. I found myself consulting AI to find other episodes where it applied, because I was… embarrassed, that I might have gotten this wrong. (Turns out, I had missed quite a few, and ChatGPT was weirdly good at finding them…)

Should we have some “draft” status for tags so that their validity can be reviewed before publication, or is part of the concept of tagging that it’s looser than other categorization methods? When we tag something, is it a “bonus” – meaning, we hope this is accurate, but it’s just a bonus, so if it’s not, no foul?

It probably depends on what you’re depending on tags for. If you’re using them as primary navigation, it would clearly become a problem. But in my experience, tags have always been used for bonus or ancillary navigation, so it’s less critical.

When is something too common to tag?

Earlier I talked about tagging treeger. This is because Treeger only appeared in five episodes, so it was vaguely interesting whenever he showed up.

Consider any of the six main cast. Should we have tags for rachel, ross, joey, phoebe, monica, and chandler?

Clearly, no. Five of them were in every one of the 236 episodes, and Matthew Perry only missed one of them in Season 9 because he was in rehab (though his voice was heard on the phone).

How about central-perk? Again, probably not, because it was in 210 of the 236 episodes, so it would just be… “tag spam.”

But consider Gunther. Do we tag his appearances? He appeared in 150 episodes (64%, about two-thirds). To my recollection, Gunther was rarely important to any plot point (one major exception: he told Rachel that Ross cheated on her, but this happened off-screen). At best, Gunther was mostly… set dressing? He would show up and say a joke or two, but that’s about it.

So, do I tag every episode with Gunther? At what point does something just become… background noise? (Sorry, Gunther. RIP.)

When do you have to explain the tag assignment?

The reasoning for some tags might not be obvious. If I tag something with guest-star, for example, the immediate question is: who was the star?

I solved this by allowing for some explanation – if a tag has a little star icon on it (⭐), then you can mouseover it for more information (doesn’t work on mobile, sorry). I would use this with guest stars to explain who the star was.

I did it a few other times, like to explain a monica-obsessive assignment when it just happened in the end credits. I also did it a couple of times with real-world-person.

It’s handy, but I don’t think this is common in most tagging systems. Tagging is usually binary – a tag is applied, or it is not. (Below, we’ll talk about all sorts of ways to “corrupt” that purity, for better or worse.)

What we’re talking about is adding metadata to the tag assignment, which is pretty rare. From a technical perspective, doing this would require treating the assignment as a manageable object itself. There might be some other benefits to this – for instance, you could track the date it was applied, and its status, if you didn’t want to delete it, but rather just “deactivate” it.
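As a sketch of what “the assignment as a manageable object” might look like, here is a small Python dataclass. The field names (`note`, `applied_on`, `active`) and the item id in the example are assumptions for illustration, not a description of how this site stores its tags.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class TagAssignment:
    item: str
    tag: str
    note: Optional[str] = None  # the mouseover explanation, if any
    applied_on: date = field(default_factory=date.today)
    active: bool = True         # False means "deactivated", not deleted


def visible_tags(assignments: list[TagAssignment], item: str) -> list[str]:
    """Tags to render for an item, skipping deactivated assignments."""
    return [a.tag for a in assignments if a.item == item and a.active]


# "some-episode" is a placeholder item id.
assignments = [
    TagAssignment("some-episode", "guest-star", note="who the star was goes here"),
    TagAssignment("some-episode", "monica-obsessive", active=False),
]
shown = visible_tags(assignments, "some-episode")
```

Once the assignment is its own record, the explanation, the date, and the status all have somewhere to live without touching the tag or the item.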

Can tags ever be hidden for administrative organization or search biasing?

In this project, all the tags were open and available for review and browsing, but in some projects, I’ve had hidden tags. These were usually for search biasing, but a few times I used them for administrative organization as well.

For example, a tag of this-writing-sucks is probably not something you want to display to the content consumer, but a report of all content tagged like that can be a handy way to create a “work list” to manage tasks and keep things organized.

To “hide” tags I’ve done one of two things:

Had a separate field for “admin tags”
Allowed parentheticals. So, if you put a tag in parentheses, it was assumed to be “hidden” and would be handled as such. For example: politics, foreign-policy, (clearly-written-when-bob-was-drunk)
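The parenthetical convention is easy to parse. A minimal sketch, assuming tags arrive as one comma-separated string (the `parse_tag_field` name is made up for this example):

```python
def parse_tag_field(raw: str) -> tuple[list[str], list[str]]:
    """Split a comma-separated tag field into (public, hidden) tag lists."""
    public: list[str] = []
    hidden: list[str] = []
    for part in raw.split(","):
        tag = part.strip()
        if not tag:
            continue
        if tag.startswith("(") and tag.endswith(")"):
            # Parenthesized tags are stored without the parentheses
            # and kept out of anything shown to the reader.
            inner = tag[1:-1].strip()
            if inner:
                hidden.append(inner)
        else:
            public.append(tag)
    return public, hidden


public, hidden = parse_tag_field(
    "politics, foreign-policy, (clearly-written-when-bob-was-drunk)"
)
```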

Which brings me back to the question: are hidden tags still tags? I think so. If you’re only using them for search biasing, they tend to be called “keywords,” but if they function the same as tags for all other purposes – they’re just not displayed to the end user – then that naming is fine.

Could we apply an intensity level to tags?

What if tagging something wasn’t binary, but a matter of degree? Would there be any value in applying an “intensity level” to a tag?

For example, if we were to apply a tag of joey-actor, we could apply it at a low intensity for the episode where Joey misses the Days of Our Lives float in the Thanksgiving parade. Technically, that happened because Joey was an actor, but the plotline is more about Joey being kinda dumb and Phoebe trying to teach him to lie (“A raccoon…!”).

But what about the episode when Rachel and Joey go to the Daytime Emmy Awards? That’s much more “about” Joey being an actor than the Thanksgiving parade. What if we could apply…more “tag-ness” to it: joey-actor:3. So, we’re applying 3x the joey-actor-ness of a normal tag assignment?

This wouldn’t be hard technically, and the colon-based example from above would be fine to annotate it. But what would we do with this information?

…I have no idea? When we list the items assigned to that tag, we could order them by this value in reverse (an assignment with no specified intensity would be assumed to be 1). In that case, the episodes that were really about something would be listed first.
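Here is roughly what the colon notation and the reverse ordering could look like. The function names and the two item ids are placeholders, and the default of 1 follows the assumption stated above.

```python
def parse_intensity(raw: str, default: int = 1) -> tuple[str, int]:
    """Split 'joey-actor:3' into ('joey-actor', 3); no suffix means the default."""
    name, sep, level = raw.partition(":")
    if sep and level.strip().isdigit():
        return name.strip(), int(level)
    return raw.strip(), default


def items_by_intensity(tags_by_item: dict[str, list[str]], tag: str) -> list[str]:
    """Items carrying `tag`, most intense first; ties keep their original order."""
    scored = []
    for item, raw_tags in tags_by_item.items():
        for raw in raw_tags:
            name, level = parse_intensity(raw)
            if name == tag:
                scored.append((level, item))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored]


# Placeholder item ids for the two episodes discussed above.
tags_by_item = {
    "thanksgiving-parade-episode": ["joey-actor"],
    "daytime-emmys-episode": ["joey-actor:3"],
}
ranked = items_by_intensity(tags_by_item, "joey-actor")
```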

However, “balancing” intensities would be tricky because we’d be asking multiple people to coordinate opinions on intensity. Or we’d be asking one person to know about all potential intensities to form a mental scale in advance.

For example, say if you were to leave Item 13 at the default intensity for something, but then you decide that Item 43 was 2x the intensity. That’s fine until you come to Item 96, which is more intense than 13 but less intense than 43. Can we do an intensity of 2.5?

I feel like you’d have to keep it very simple.

Assume a “2” value if no intensity was specified
Use a “1” if the tag only sort of applies
Use a “3” if the tag really applies

Although, on that simplified scale, we might just use a “+” for more intensity and “-” for less. So, joey-actor-, joey-actor, and joey-actor+
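The plus/minus shorthand maps directly onto that three-point scale. A tiny sketch (the function name is made up, and it assumes no real tag name ends in a hyphen or plus sign):

```python
def parse_plus_minus(raw: str) -> tuple[str, int]:
    """Map 'tag-' / 'tag' / 'tag+' onto the 1 / 2 / 3 scale described above."""
    if raw.endswith("+"):
        return raw[:-1], 3
    if raw.endswith("-"):
        return raw[:-1], 1
    return raw, 2


levels = [parse_plus_minus(t) for t in ("joey-actor-", "joey-actor", "joey-actor+")]
```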

This is one of those things that might be a feature in search of a use case. However, it would help solve the “tag indecision” problem. In some cases, I would have been much more likely to tag something that was “on the bubble” if I knew that I could specify an intensity level.

My problem was that I almost didn’t want to tag the episode when Joey missed the parade, because I would think about other episodes that had a higher intensity of joey-actor-ness, and I’d think this wasn’t equivalent. I might have been more likely to tag it because, by leaving it at the default intensity (and giving a corresponding raise to the other episode), I could accurately represent how I felt about it.

What does it mean when something has no tags?

There were a handful of episodes to which I didn’t assign any tags. Does that mean nothing happened in those episodes? Well, no, it just means nothing happened that merited a tag assignment …in my subjective opinion, based on my natural tag selecting tendencies, and based on the collection of tags I had created to that point.

Looking at all the notes above this one, there were some episodes that involved things that didn’t merit a tag, either because they were ubiquitous, or unique, or otherwise didn’t rise to the right level. If we decided to tag Gunther, for example, then some untagged episodes would get tagged for him.

Still, I always felt bad about it. And if I watched all the episodes in detail, I might find I missed something, or might come up with a way to tag them – some aspect of the episode that I could turn into a tag, but I didn’t want to manufacture tags for their own sake.

Should we provide a query language for tags?

Searching for items associated with a single tag is simple and straightforward. However, should we allow users to search for items based on a “query” of different tag status?

For example –

If I want to see items about Paolo – the Italian guy from the first season – I can search for paolo. If I want to see items about Phoebe’s career as a masseuse, I can search for phoebe-masseuse. But what if I want to see if those two things intersect?

And they did – in The One with the Dozen Lasagnas, Paolo hit on Phoebe while she was massaging him, leading to the end of his relationship with Rachel. How would we search for this?

To this end, do we need a tag query language? Could we search for paolo + phoebe-masseuse to refer to that intersection? Could we go further –

paolo OR phoebe-masseuse to find items that have either of those tags
bing-adoption NOT erika to find items about Chandler and Monica trying to adopt before they met Erika (the eventual birth mother of their adopted children).
Could we do parentheticals? If I wanted to see items that just involved Carol or Ben, without Susan, could I do (carol OR ben) NOT susan?
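A query language like this is small enough to sketch. The evaluator below is one possible reading, not a spec: it treats `+` and `AND` as intersection, `A NOT B` as “A without B,” supports parentheses, and assumes an `OR` operator for “either” matches. The episode ids in the usage example are placeholders.

```python
import re

# Parentheses and "+" are their own tokens; everything else is a word.
TOKEN = re.compile(r"\(|\)|\+|[^\s()+]+")


def matches(query: str, item_tags: set[str]) -> bool:
    """Evaluate a small boolean tag query against one item's tags."""
    tokens = TOKEN.findall(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of query")
        pos += 1
        return tok

    def parse_or() -> bool:
        value = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()  # parse first so tokens are always consumed
            value = value or rhs
        return value

    def parse_and() -> bool:
        value = parse_atom()
        while peek() in ("AND", "+", "NOT"):
            op = take()
            rhs = parse_atom()
            value = value and (not rhs if op == "NOT" else rhs)
        return value

    def parse_atom() -> bool:
        tok = take()
        if tok == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if tok in (")", "AND", "OR", "NOT", "+"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok in item_tags

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result


# Placeholder episode ids with the tags from the example above.
episodes = {
    "dozen-lasagnas": {"paolo", "phoebe-masseuse"},
    "some-other-episode": {"paolo"},
}
hits = [name for name, tags in episodes.items()
        if matches("paolo + phoebe-masseuse", tags)]
```

In this sketch `AND`, `+`, and `NOT` bind tighter than `OR`, so `a OR b NOT c` means `a OR (b NOT c)`; parentheses override that.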

Then this led me to another thought –

If we provided a query language, would this affect how we did tag assignments?

If we did allow tag querying, I feel like this would fundamentally change how we were able to tag things. A lot of tagging boils down to trying to make sure someone can find something. However, with a query language, we would be much less concerned with intersections.

If I tag something las-vegas and ross-rachel, I might still tag it with ross-rachel-marriage, because that speaks to a specific intersection.

However, with a query language, someone with enough domain knowledge could query for las-vegas AND ross-rachel to find this, since their drunken marriage was their major plotline in Vegas.

(This is a little contrived, because it was hard to find an example in this project. But consider a blog post that I tag with history and technology. There’s a specific intersection there: history-of-technology. If I have a query language, that’s implicitly covered by the query history AND technology.)

However, this led me to yet another point –

How much are tags about “exploration,” rather than just organization?

Do we tag things to search them? Or do we tag things to explore a domain of information?

I feel like tags are… opportunistic. People are reactive about tags, not proactive. They’re not going to search tags. Rather, they’re going to see one, be reminded of some aspect of the content they’re consuming, and want to see what else fits into that tag.

So, if we did provide a comprehensive tag query language, would anyone use it?

And this got me thinking that –

Could tags be algorithmic, meaning they query “behind the scenes”?

Here’s one of the fundamental aspects of a tag: it’s a promise to produce content related to the tag. The default model is that the tag produces content that has been proactively assigned to the same tag – this is how tags are presumed to work.

But what if we supplemented that model with algorithmic tag assignment or item retrieval?

To go back to the example above, what if, when we were rendering tags, we detected that an item was tagged with both history and technology, so we automatically added history-of-technology?

We could handle the search two ways:

If we actually added this as a “true” tag on item save, then there would presumably be others tagged with the same thing, so it would just work normally. This would be “supply side” retrieval.
Alternately, we could bind certain “auto tags” to a tag search, so if someone tried to access history-of-technology, we would actually produce the items with a tag query search for history + technology. This would be “demand side” retrieval.
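The “demand side” option can be sketched as a lookup that falls back to a bound set of tags. The `AUTO_TAGS` table and the post ids are hypothetical, and the binding is simplified to “all of these tags must be present” rather than a full query.

```python
# Hypothetical mapping from an "auto tag" to the real tags it stands for.
AUTO_TAGS = {"history-of-technology": {"history", "technology"}}


def items_for_tag(tag: str, tags_by_item: dict[str, set[str]]) -> list[str]:
    """Demand-side lookup: direct assignments plus items matching the bound tags."""
    required = AUTO_TAGS.get(tag)
    found = []
    for item, tags in tags_by_item.items():
        directly_tagged = tag in tags
        implied = required is not None and required <= tags
        if directly_tagged or implied:
            found.append(item)
    return found


# Placeholder post ids.
posts = {
    "post-1": {"history", "technology"},
    "post-2": {"history"},
    "post-3": {"history-of-technology"},
}
found = items_for_tag("history-of-technology", posts)
```

Nothing is written back to the items here, which is the whole point of the demand-side approach: the intersection is computed when someone asks for it.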

Or what if we supplemented tag assignment on save? Meaning, when the content item was saved (for the purposes of applying tags or just a regular edit), we could process the item to auto-add some tags:

We could search the item to find references to keywords, like marcel. If the description or text of an item included “Marcel,” we could check for that tag, and auto-assign it if it didn’t exist.
We could assign tag intersections. If something was tagged ross-rachel and las-vegas, we could look for and auto-add ross-rachel-marriage.
We could use AI or some other algorithmic process to check for tags that might be missing (see below for more on this).

The beauty here is that this is entirely supply-side, meaning it’s just an editorial automation. This wouldn’t change anything on the user-facing side.
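The first two “on save” ideas could look something like this. Both rule tables are hypothetical, and the keyword check only matches whole words (so “Marcel’s” would be missed), which a real version would need to handle.

```python
import re

# Hypothetical rules; in practice an editor would maintain these.
KEYWORD_TAGS = {"marcel": "marcel"}  # word found in the text -> tag to add
INTERSECTION_TAGS = {
    frozenset({"ross-rachel", "las-vegas"}): "ross-rachel-marriage",
}


def tags_on_save(text: str, tags: set[str]) -> set[str]:
    """Return the item's tags plus any auto-assigned ones."""
    result = set(tags)
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    for word, tag in KEYWORD_TAGS.items():
        if word in words:
            result.add(tag)
    for required, tag in INTERSECTION_TAGS.items():
        # Add the intersection tag only when every required tag is present.
        if required <= result:
            result.add(tag)
    return result


saved = tags_on_save("Marcel escapes.", {"ross-rachel", "las-vegas"})
```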

How important is tag naming, and is that static? Or can/do tags evolve over some axis of an information domain’s evolution?

Domains of information might change over time. In the case of this project, the “Friends universe” evolved over 10 years and 236 episodes, so naming things can get tricky as those things changed in nature and definition.

Ron Leibman (RIP) played Dr. Leonard Green, Rachel’s father, who did not like Ross and wasn’t shy about letting him know (“wethead”). He was a notable, recurring character – any episode he was in always had a plotline just for him.

But what do we call him? rachels-dad or dr-green or do we lump him in with Rachel’s mother as rachels-parents? We could go even further and say rachels-family, but there were multiple, notable episodes involving both of Rachel’s sisters (Christina Applegate won an Emmy for one of them), so do we break them out?

Clearly, the core question is, which name/label do users identify with more? When a user sees this on a tag, which one are they going to recognize?

But there’s another interesting angle here –

Note that rachels-dad is relational. It frames Dr. Green as having an identity only in relationship to another character. He matters only because of his connection to Rachel.

However, let’s pretend for a second that Dr. Green fell in love with Chandler’s Mom, and they both became regular characters in the later seasons. Can he still be rachels-dad? Or has his usage in the show morphed into something self-sustaining, and that doesn’t need Rachel as an “anchor”?

Another example is ross-professor. He was only a professor in the later seasons (after he stopped working at the museum). This was a notable career change, because it drove some story arcs (the entire elizabeth arc and a non-trivial part of the charlie arc). So the ross-professor tag is really a transformation of the ross-museum tag – it refers to the same concept and characterization, that of Ross having a career that requires a high-level education – but they’re two separate tags.

Maybe that’s only specific to this domain of information – a narrative that continues over time – but the phenomenon of a tag changing context and meaning could be more prevalent.

Interestingly, I would run into this on some guest-star taggings. Consider Ellen Pompeo as Missy Goldberg – Ross and Chandler’s college crush in Season 10. She was not a “star” when this aired in early 2004. Grey’s Anatomy was about a year away from its premiere. So, does she get tagged for being a guest star when she wasn’t at the time?

The larger question: is it accepted that tagging something is only valid at the moment of tagging, and is subject to… “assignment decay” after that? When we tag something, are we freezing time to say that Item X was related to Concept Y at the time Item X was created?

And this led me down another rabbit hole –

Should tags be related to each other in some larger architecture?

One of the common models of tagging is that there is just a big pool of them. All tags are equal. This is one of the differentiators I cited above to separate tags from more formal categorization.

But you quickly run into thoughts about structuring tags into some larger framework. For example:

There’s a tendency to want to apply some hierarchy. Should I roll up ross-museum and ross-professor into ross-career? I could provide some navigation either way: if you looked at either of the narrower tags, I could direct you to the broader one. If you looked at the broader one, I could show items from both of the narrower ones.

I had the exact same problem when I realized that Monica and Chandler’s wedding wasn’t tagged with wedding. Rather, I realized it was tagged with bing-wedding. So, I had to tag it with both the broader and narrower concepts.

I found myself wanting to provide some form of related tags. I almost wanted to tag tags. Like grouping mike, mona, kathy, etc. into a “super tag” of relationships. This is not hierarchy, really, because relationships might not be the “parent” of all of them – they might easily fit into other super tags. For instance, I could conceivably put david and kathy into a super tag of infidelity (remember Phoebe’s kiss with David?)
As mentioned above, tag names can change with time, so could we specify that two tags exist on some kind of continuum? There are a lot of episodes tagged with richard, and then some tagged with richard-breakup, because Monica spent about as many episodes recovering from the breakup as she did in the relationship. Should I represent this somehow? Like, somehow specify that richard-breakup is a “continuation” of richard?

The problem of tag consistency (discussed more below) could be helped by linked variations of the same general idea. If you have more than one person tagging, it might be handy to be able to say that joey-actor, joey-career, and joey-acting are all the same thing. There might be better ways of solving this (like catching and consolidating those on save), but this is one option.
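Two of these relationships, broader/narrower and “same thing,” can be sketched with a pair of lookup tables. The tables below are hypothetical and only use tags mentioned above; picking joey-actor as the canonical name is an arbitrary choice for the example.

```python
# Hypothetical relationship tables.
BROADER = {
    "ross-museum": "ross-career",
    "ross-professor": "ross-career",
    "bing-wedding": "wedding",
}
SAME_AS = {"joey-career": "joey-actor", "joey-acting": "joey-actor"}


def canonical(tag: str) -> str:
    """Collapse known variants onto one agreed name."""
    return SAME_AS.get(tag, tag)


def expand(tags: set[str]) -> set[str]:
    """Canonicalize variants, then add every broader tag up the chain."""
    result: set[str] = set()
    for tag in tags:
        current = canonical(tag)
        # Stop at the top of the chain, or when this branch was already added.
        while current is not None and current not in result:
            result.add(current)
            current = BROADER.get(current)
    return result


wedding_tags = expand({"bing-wedding"})
```

With something like this in place, tagging an episode bing-wedding would be enough for it to show up under wedding too, without assigning both by hand.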

Some of this might come back to a tag query language, as I mentioned above. With a query language, a lot of these relationships become “pseudo tags” that just represent combinations of other tags.

Should a tag ever only have one assignment?

If something is tagged one time, then what’s the point? If I can’t view other things tagged that way… should the tag exist? Should we suppress one-off tags from the list of tags? Should a tag only become “active” when more than one thing is tagged?

When we tag something, are we saying:

“This thing is about this tag”
“Click this tag to find more things like this thing”

Are we just labeling a body of information, or are we providing exploration and navigation opportunities?

Single-use tags are inherent in the “bottom up” nature of tagging. Put another way: every tag is single-use at some point, because we tag incrementally. The first time I applied a tag, that was the only place it was assigned, until I applied it somewhere else.

And this leads us to another point: when we apply a tag for the first time, are we naturally assuming we will apply it somewhere else?

I’m very familiar with this particular body of information. So when I tagged ross-rachel for the first time in Season 1, Episode 1, I did so knowing that I would tag many future episodes the same way. So, is that why I tagged it? Because I knew this was a recurring thing?

In Season 8, Episode 14 – The One With The Secret Closet – you see the “fourth wall” for the one and only time in the entire series. There’s a shot of Joey and Chandler standing in front of the titular closet, from the perspective of the closet, and you can clearly see the presumed wall that all the cameras shoot through (it’s purple, it turns out. Here it is.).

This is the only time that happened. Should I have tagged this fourth-wall? When I tagged that episode, I knew this moment happened, and I also knew that it never happened before or since. Is this why I didn’t tag it as such?

(Honestly, I don’t remember my actual thought process at the time, but this is what I recollect…)

If I saw the fourth wall again, would I have thought to myself, “I’ve seen that once before. So that’s two. I need to go back and tag the first occurrence”? It’s like that kid’s game Concentration. You flip over a card and think, “I’ve seen this picture somewhere before,” and you go looking for it.

For an even more obvious example, I never tagged the first episode with series-premiere or the last episode with series-finale, because what would be the point? There’s clearly only one of each, by definition. (For that matter, why not tag Season 6, Episode 22 with episode-144?)

Also, maybe you’re just tagging something descriptively? If I tagged an episode as tear-jerker then I’m saying something about the episode, even if I’m not connecting it to anything. It’s a label I’m slapping on it – a channel to provide a dimension or perspective.

You see this in social media all the time: tags as simply social commentary, without the expectation that they will connect to anything.

One other thing I noticed: a single-tag assignment always has a non-trivial chance of just being a misspelling of another tag. A second assignment is essentially a confirmation of the validity (the spelling, really) of a tag. No second assignment means that it has not been confirmed.
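That observation suggests a cheap audit: list every single-use tag that looks suspiciously like a tag used more than once. A minimal sketch using Python’s standard `difflib`; the function name and the 0.8 similarity cutoff are arbitrary choices for illustration.

```python
from collections import Counter
from difflib import get_close_matches


def suspicious_singletons(assigned: list[str], cutoff: float = 0.8) -> dict[str, list[str]]:
    """Map each single-use tag to similar-looking tags that are used more than once."""
    counts = Counter(assigned)
    established = [tag for tag, n in counts.items() if n > 1]
    report: dict[str, list[str]] = {}
    for tag, n in counts.items():
        if n == 1:
            near = get_close_matches(tag, established, n=3, cutoff=cutoff)
            if near:
                report[tag] = near
    return report


# One entry per assignment; the third one is a deliberate misspelling.
assigned = ["phoebe-masseuse", "phoebe-masseuse", "phoebe-massuese", "fourth-wall"]
report = suspicious_singletons(assigned)
```

A legitimately unique tag like fourth-wall passes through untouched, because nothing established resembles it.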

The problem of tag consistency

I had an inherent advantage in this situation, because I was the only person doing the tagging. This led to some consistency on two levels –

Active consistency, in the sense that I can remember tags I’ve applied and apply them again
Passive consistency, in the sense that all tag applications are the product of a single mind that works (relatively) the same way from evaluation to evaluation – the same mind is looking at and synthesizing Item X, Item Y, and Item Z

But, clearly, even I made mistakes – forgetting I used a tag previously, or misspelling it, or thinking up some new angle (I used both phoebe-old-life and phoebe-past-life, and I spelled “masseuse” at least four different ways before just going with phoebe-massage).

Tags, by definition, don’t have a central authority behind them. People can make up any tag they want. (Some scenarios might force people to use a specific set of tags, but I would argue those aren’t tags anymore; those are categories.)

Also, if you have more than one person tagging, there might be little or no coordination between them. They could use different tag names, or different tag scopes (see above for more on tag scopes).

Someone might just apply one tag per episode, thinking that they should consider the episode in its entirety, and sum it up in a single tag. Someone else might be looking at every individual plotline per episode (usually two or three), and also any trivia or random appearances of things or concepts.

Some solutions for consistency that I’ve seen –

Autocomplete. When users start typing a tag, they see tags that match what they’re typing. This helps with misspellings, and keeps people from inventing new tags, but it’s necessarily dependent on how the tag is spelled, and how well you can match it based on that. What’s trickier is to autocomplete the concept, not just the spelling.
Restrictions about tag invention. Some systems will prevent most users from creating a new tag. To use a tag for the first time, a user has to clear some bar – either be placed in a specific user group, or have contributed X number of content items, or some other rubric to figure out if they know what they’re doing.
Tag suggestions. Some systems might examine the item being tagged and suggest tags based on existing tags applied (“Items tagged sports-car are often also tagged with cars”) or based on the content itself (“Content like this is often tagged politics”). This is essentially some level of system-powered organization, but it’s just asking for the user’s permission. If they don’t take that suggestion (give their permission), do you tag it anyway? Or do you do this on the consumption side – when someone is viewing the items assigned to the sports-car tag, do you say something like, “You might also be interested in the cars tag?”
Human intervention. Some systems (using that word to include human-powered processes) have human editors who review new tags (all tags?) to make sure they’re assigned correctly. This group (cabal?) of editors presumably communicates about consistency, discussing new and emergent tags and how they should be handled.
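To make the "often also tagged with" style of suggestion concrete, here is a rough sketch based on simple co-occurrence counting. It is one possible approach, not how any particular system does it. The sample data, the `suggest_tags` name, and the 50% threshold are assumptions for illustration.

```python
from collections import Counter

# Hypothetical sample data: item id -> set of tags.
tagged_items = {
    "item-1": {"sports-car", "cars"},
    "item-2": {"sports-car", "cars", "racing"},
    "item-3": {"sports-car", "racing"},
    "item-4": {"cars", "politics"},
}

def suggest_tags(applied, tagged_items, min_share=0.5):
    """Suggest tags that often co-occur with the tags already applied."""
    applied = set(applied)
    co_counts = Counter()
    matching_items = 0
    for tags in tagged_items.values():
        if applied & tags:                 # item shares at least one applied tag
            matching_items += 1
            co_counts.update(tags - applied)
    if matching_items == 0:
        return []
    # Most frequent first; ties broken alphabetically so results are stable.
    ranked = sorted(co_counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [tag for tag, n in ranked if n / matching_items >= min_share]

print(suggest_tags({"sports-car"}, tagged_items))
```

Whether the result is shown to the editor as a prompt or to the reader as a "you might also be interested in" link is a separate decision; the counting is the same either way.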

Is tagging just some version of set theory?

Set theory is a long-standing concept in mathematics. The mathematician Georg Cantor once said:

A set is a gathering together into a whole of definite, distinct objects of our perception or of our thought – which are called elements of the set.

He did not say the following, but it still makes sense:

A set is a Many that allows itself to be thought of as a One.

And this got me thinking: is tagging basically an inverted form of set theory? (Technically, Naive Set Theory.)

When we tag something, we’re assigning it to a set of Things That Have That Tag. That set might already exist, or our first tagging of something might make it pop into existence at that moment.

Set theory is the basic concept that allows us to refer to objects as a group. We make the claim that “the Many” can be thought of as a One. For example: ross-professor is a single reference to several different items, all of which have set membership in common.

Consider the rules of set theory:

We can think up any set we want. Sure – we can invent any tag we want. This is one of the benefits of tagging.
Set identity is determined by membership. This means the “set” (the tag) is not really a thing – it exists only to contain the objects in it (we might violate this a bit if we add metadata to a tag).
The order of objects in a set does not matter. The order of episodes in a tag doesn’t matter – we order them based on their air date.
Repeats don’t change anything. We logically wouldn’t allow something to be assigned to the same tag twice.
The union of any two sets is itself a set. See the bit above about tag querying – the result of a tag query would be another tag.
Any subset is a set. Same thing about the tag querying.
A set can have just one member. Sure. I did that all the time.
A set can have no members. Uh…this one we have a problem with. This works if tags exist in a pre-defined structure. But if tags are ad hoc, then they only exist because they have things assigned to them. If nothing is assigned to it, the tag does not exist.
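The rules above map almost directly onto a set type in most programming languages. Here is a small sketch using Python's built-in `set`, assuming tags are stored as a dictionary of tag name to members. The episode ids are placeholders, not my real assignments, and `apply_tag` and `remove_tag` are hypothetical helper names.

```python
# Hypothetical tag index: tag name -> set of episode ids (placeholder ids).
tags = {}

def apply_tag(tag, episode):
    """Assign an episode to a tag, creating the set on first use."""
    tags.setdefault(tag, set()).add(episode)

def remove_tag(tag, episode):
    """Unassign an episode; an ad hoc tag with no members stops existing."""
    members = tags.get(tag, set())
    members.discard(episode)
    if tag in tags and not members:
        del tags[tag]

apply_tag("ross-rachel", "episode-a")
apply_tag("ross-rachel", "episode-b")
apply_tag("ross-rachel", "episode-b")       # a repeat changes nothing
apply_tag("ross-professor", "episode-b")
apply_tag("marcel", "episode-c")            # a set with just one member

either = tags["ross-rachel"] | tags["ross-professor"]   # the union is itself a set
both = tags["ross-rachel"] & tags["ross-professor"]     # so is any subset

remove_tag("marcel", "episode-c")           # last member removed, so the tag vanishes
print(sorted(either), sorted(both), "marcel" in tags)
```

The one place the code has to work against the math is the empty set: Python is perfectly happy with an empty `set()`, so the sketch deletes the tag explicitly to model the ad hoc behavior.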

How would we optimize a UI for tag application?

I feel like tagging is quirky in terms of editorial needs. A couple examples –

Tagging can be very asynchronous – you think up tags at the weirdest times
Tagging is contextual to other content – what you tag an item is often influenced by how other items are tagged, and by extension, what tags exist in the system

How do we manage this from a UI standpoint? Could we optimize a UI for tagging? In fact, could we develop tagging patterns that can be applied independently of whatever content technology is being used to manage the content?

The goal is to lower the “editorial friction” associated with tagging. A smaller, faster interface would be helpful. Instead of loading an entire editorial UI to tag something, it would be better to reveal a single textbox in one click or even a hotkey sequence. (If tags are stored as a property on the content object, you can still fully version the object after changing the tags.)

Additionally, when staring down a text prompt to generate tags, a couple other UI elements would be helpful –

As noted above, autocomplete would help you find tags that already exist, and ensure uniform spelling to minimize one-offs
Some ability to see what content has already been applied to a tag might help an editor determine if a tag is appropriate – something like a small, embedded search UI
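As a rough illustration of the autocomplete piece, here is a minimal sketch that matches on spelling only, which is exactly the limitation mentioned earlier. The `autocomplete` function name, the ranking (prefix matches first, then tags containing the text elsewhere), and the limit of five results are my own assumptions.

```python
# Tags assumed to exist in the system already.
existing_tags = [
    "phoebe-massage",
    "phoebe-old-life",
    "phoebe-family",
    "ross-rachel",
    "ross-professor",
]

def autocomplete(typed, existing_tags, limit=5):
    """Return existing tags matching what the editor has typed so far.

    Prefix matches come first, then tags that contain the text elsewhere.
    """
    needle = typed.strip().lower()
    if not needle:
        return []
    prefix = sorted(t for t in existing_tags if t.startswith(needle))
    inside = sorted(
        t for t in existing_tags if needle in t and not t.startswith(needle)
    )
    return (prefix + inside)[:limit]

print(autocomplete("pho", existing_tags))
print(autocomplete("prof", existing_tags))
```

Matching inside the tag, not just at the start, matters for a naming scheme like mine, where the interesting word often comes after the character name.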

Finally, it might be helpful to allow tagging from top-down in addition to bottom-up. Meaning, an editor could edit the list of items assigned to a tag and add or remove items from that perspective. So, the editor would be adding and removing items from tags, which is contrary to the standard mode of adding or removing tags from individual items.

Several times, I would see a tag, and realize there were other episodes I wanted to make sure were assigned to it. Or I might see a tag and decide that several things should be removed from it (perhaps I had thought up a way to subdivide the tag). In these situations, I would have had to individually edit each item, where it would have been easier to visualize the list of items assigned to a tag and work from there.
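One way to support both directions is to keep two lookups in sync, so an edit can start from an item or from a tag. This is only a sketch under that assumption. The `TagIndex` class, its method names, and the placeholder episode ids are hypothetical and not tied to any content system.

```python
from collections import defaultdict

class TagIndex:
    """Keeps item->tags and tag->items in sync so edits can start from either side."""

    def __init__(self):
        self.tags_by_item = defaultdict(set)
        self.items_by_tag = defaultdict(set)

    def add(self, item, tag):
        """Bottom-up edit: add one tag to one item."""
        self.tags_by_item[item].add(tag)
        self.items_by_tag[tag].add(item)

    def remove(self, item, tag):
        """Bottom-up edit: remove one tag from one item."""
        self.tags_by_item[item].discard(tag)
        self.items_by_tag[tag].discard(item)

    def set_items_for_tag(self, tag, items):
        """Top-down edit: replace the full membership list of one tag."""
        wanted = set(items)
        current = set(self.items_by_tag[tag])
        for item in current - wanted:
            self.remove(item, tag)
        for item in wanted - current:
            self.add(item, tag)

index = TagIndex()
index.add("episode-a", "ross-rachel")
index.add("episode-b", "ross-rachel")
# Working from the tag: drop episode-a, keep episode-b, add episode-c.
index.set_items_for_tag("ross-rachel", ["episode-b", "episode-c"])
print(sorted(index.items_by_tag["ross-rachel"]))
```

The top-down method is just a batch of ordinary adds and removes, so each affected item can still be saved and versioned individually.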

Could we develop a “maturity model” of tag application and sophistication?

While it’s helpful to show a progression of maturity, models like this can be problematic because they assume everyone wants to go in the same direction. So understand that the below is based on the notion that you love all my rantings from above, and you want to push your tagging to the most absurd degree possible.

Simple assignment of text-based tags to content items via comma-delimited text. This is the most basic definition of “tagging.”
Editorial assistance for tag application, via autocomplete of existing tags.
The ability to specify tag synonyms or some other algorithm of tag consistency, with redirection/rewriting to the preferred terms on item save.
In-context editorial tag search and review from within the editorial UI
Editorial tag suggestion based on content analysis.
The ability to apply hidden tags to bias search or provide for administrative management.
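The first and third levels in that list are simple enough to sketch together: parse comma-delimited text, then rewrite synonyms to a preferred term on save. The synonym table is hypothetical. I really did use both phoebe-old-life and phoebe-past-life, but which one is "preferred" here is an arbitrary choice, and phoebe-masseuse is a made-up variant. The `normalize_tags` name and the lowercase-and-hyphenate rule are also assumptions.

```python
# Hypothetical synonym table: variant -> preferred tag.
PREFERRED = {
    "phoebe-past-life": "phoebe-old-life",
    "phoebe-masseuse": "phoebe-massage",
}

def normalize_tags(raw):
    """Parse comma-delimited tag text and rewrite synonyms to preferred terms."""
    seen = []
    for part in raw.split(","):
        # Lowercase and join words with hyphens: "Ross Rachel" -> "ross-rachel".
        tag = "-".join(part.strip().lower().split())
        tag = PREFERRED.get(tag, tag)
        # Skip empty entries and duplicates, keeping first-seen order.
        if tag and tag not in seen:
            seen.append(tag)
    return seen

print(normalize_tags("Ross-Rachel, phoebe-past-life, phoebe-old-life, "))
```

Because the rewrite happens on save, the editor can keep typing whichever variant comes to mind and the stored tags stay consistent.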

(Note: to this point, all enhancements are purely at the editorial, supply-side level and have no effect on the simple tag architecture presented to the user, demand-side. From this point forward, we’ll… corrupt the purity of how the user perceives the tagging system.)

Metadata for tags, such as a description of the tag intention.
Metadata for the tag assignments, such as an explanation of why the item was tagged as such
Tag segregation, via namespacing or separate property storage
Tag organization and relations – hierarchical, synonyms, related tags
Tag intensity levels (though, as noted above, I don’t even know if this is a good idea…)
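To show how far this drifts from "a short bit of text," here is a sketch of what a tag and a tag assignment might look like once they carry metadata and a namespace. Every field name, the `namespace:name` colon syntax, the "plot" namespace, and the `parse_tag` helper are assumptions for illustration, not a description of any real system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tag:
    name: str
    namespace: str = ""     # empty string means the default namespace
    description: str = ""   # what the tag is intended to mean
    hidden: bool = False    # hidden tags bias search or support administration

@dataclass
class Assignment:
    item_id: str
    tag: Tag
    note: str = ""          # why this item was given this tag

def parse_tag(text, default_namespace=""):
    """Split 'namespace:name' text into a Tag; the colon syntax is an assumption."""
    namespace, sep, name = text.strip().lower().partition(":")
    if not sep:
        # No colon present, so the whole text is the tag name.
        return Tag(name=namespace, namespace=default_namespace)
    return Tag(name=name, namespace=namespace)

assignment = Assignment(
    item_id="episode-a",
    tag=parse_tag("plot:ross-rachel"),
    note="Placeholder explanation of why this episode got the tag.",
)
print(assignment.tag)
```

Notice that the tag is now a record with an identity of its own, which is exactly the "corruption" of the pure set idea: the set is no longer defined only by its members.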

How much of this is related to organization in general, not just tagging?

We’ve talked about how tagging and more formal forms of categorization are different. So, how much of these “lessons” apply to other forms of organization?

At the risk of being reductive, what we’re doing here is analyzing and synthesizing the content and reducing it to a specific label. What we’re talking about is getting at the “gestalt” of a piece of content, and grouping it based on that.

And, as we talked about above, this isn’t just on one scope. There are multiple ways to evaluate an item – plot, appearance, story arc, etc. In that sense, we’re picking through the content, plucking out aspects, and identifying them as significant. Those “points of significance” connect to other items.

So, how similar are the problems here to categorization, and can we draw parallel lines between them? …maybe. Since a categorization structure is set beforehand, it removes a lot of the “just in time” analysis. Those decisions (arguments?) happen before an editor evaluates something, which reduces the role of the editor to mapping items to existing concepts.

In some ways, a category “begs” for assignment. The existence of a category for fourth-wall would imply that this situation exists, and that it happens more than once in the series, or else why would it exist?

And how does this relate to a static body of information? Friends only exists in the past – no new episodes are happening, so if the entire series has been evaluated and put into a categorization structure… does that change anything?

It makes me wonder if one unique aspect of tagging is that it’s continual and always subject to refinement and alteration. Someone new could review the body of information and start applying tags based on astrological signs, for example. I’m not saying you would want this necessarily, but it speaks to the ad hoc, personal, idiosyncratic nature of tagging as a product of individual minds, rather than a central authority.

Will AI make all of this irrelevant?

I got to wondering if this entire concept of manually analyzing content and figuring out a way to intersect them all with a tag structure is antiquated in the age of AI.

I have a paid plan on ChatGPT, so I had an exchange with it where I asked it to tag all 236 episodes.

Here’s the transcript

You can see that we went back and forth about format for a bit, and after that it seemed like it was just summarizing the plotlines of different episodes. Like this:

laundry-date
joey-scheme
chandler-breakup

I told it that I wanted to group related episodes under common tags, and that seemed to resonate with it. It then presented me with a list of potential tags, and it wasn’t bad (in fact, I had already come up with matching tags for all of these):

ross-rachel
monica-chandler
joey-acting
phoebe-family
chandler-janice

Then it proceeded to tag the first season. It would have continued to tag them all, but the theory was proven so I didn’t take it any further.

So… will we still tag things manually in the future? Honestly, I have no idea. For a large domain of information, programmatic/AI-based tagging is clearly a very attractive option.

But tagging goes beyond just the mechanics of it. Part of the attraction of tagging is that it’s a little idiosyncratic, and it somehow reflects the personality of the person doing the tagging. AI is, by definition, just a summarization of the data it was trained on, which means AI is never really going to surprise us. As such, maybe AI is better as just a method of ensuring consistency (as noted above), while human assignment is the thing that provides the serendipitous scopes that delight people.

Conclusion

There really isn’t one. It was tedious and fraught with self-doubt. I did the best I could, learned a lot of things along the way, but somehow came out with more questions than when I started.

Beyond the basic concept of it being some form of set theory, tagging is whatever you want or need it to be.