A reflection on the Birmingham Screwdriver

Birmingham Screwdriver Company Art by Foka Wolf

This week, I delved into the intriguing world of the Birmingham screwdriver – a slang term for a hammer. It turns out that in the 19th century, Birmingham, a bustling hub of the Industrial Revolution, faced a bias similar to the one we’ve witnessed more recently toward China: a preconceived notion that everything produced there lacked quality because the labor was unskilled. For lack of skill, the story went, workers would use the same tool – supposedly a hammer – for every purpose.

“If the only tool you have is a hammer, it is tempting to treat everything as if it were a nail.” – I had already heard this quote, which is generally attributed to Maslow (the same person who created the hierarchy of needs), but the Birmingham historical background brings a different charm to the concept. Fast forward to 2024, and I am often reminded of it by the highly skilled workers of the tech community. In tech, trends in tooling often resemble Birmingham screwdrivers.

Take Scrum, for instance. It’s been around for almost three decades, yet only recently have people begun to grasp that agility is more about mindset than process. It’s about adapting your approach based on your team’s people and purpose. More recently, this notion struck me when, as a product manager, I noticed a push towards releasing everything as an A/B test. While this can be beneficial in certain scenarios, it’s not a one-size-fits-all solution. For instance, in my case with a small audience, running experiments can lead to long waits for meaningful results and legal compliance headaches. I fear that sometimes people may take refuge in experimentation to compensate for poor discovery (hence, higher risk).
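To make the “long waits” point concrete, here is a rough sketch of the arithmetic, using the textbook normal-approximation sample size formula for comparing two conversion rates. The baseline rate, the expected lift and the weekly traffic are illustrative assumptions, not real figures, and the function names are made up for this example.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline: float, lift: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed in EACH variant (two-sided two-proportion z-test)."""
    p1 = baseline
    p2 = baseline + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def weeks_to_result(weekly_users: int, baseline: float, lift: float) -> float:
    """Weeks needed to fill both variants, assuming all weekly users enter the test."""
    return 2 * sample_size_per_variant(baseline, lift) / weekly_users


if __name__ == "__main__":
    # Assumed numbers: 10% baseline conversion, hoping for 12%, 500 users a week.
    print(sample_size_per_variant(0.10, 0.02))
    print(round(weeks_to_result(500, 0.10, 0.02), 1))
```

With these assumed inputs the test needs a few thousand users per variant, which at 500 users a week means waiting several months for a single answer.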

Back to the hammer. Cognitive biases affect every profession, regardless of “skill” level. I use “skilled” in quotes because true expertise includes understanding our own cognitive distortions. Education often neglects to teach us critical thinking and self-awareness. It’s not just about learning what others have theorized but about stepping into the shoes of the philosopher, constantly questioning assumptions and beliefs. Discovering and addressing our biases not only makes us better professionals but also more empathetic individuals. We begin to see that certain behaviors stem from a lack of exposure to information or critical stimulation, rather than inherent flaws. As a parent navigating the challenges of raising a curious 4-year-old, I’m reminded daily of how hard it is to foster a questioning mindset, as the first thing she’ll do with it is challenge my parental authority. So excuse me while I drown in despair over constant toddler meltdowns – I’m doing it for the greater good.

Beyond the hype: building LLM applications for production

It’s been nearly four months since we launched the first LLM-based feature in Talkdesk: call summarization. It was an obvious first candidate, given the relatively simple nature of the use case and the expected impact on agent productivity. While beta testing this new feature with a few selected customers, we moved on to other use cases: topic extraction, to provide the customer with a list of topics discussed in their calls or chats, and question answering, to support agents handling customer queries.

Many other use cases are already lined up, from automated agent evaluation to message writing helpers. But as a product manager’s mind navigates the sea of possibilities, it’s also important to take a step back and think about the lessons we’re learning along the way.

The ambiguity trap – how users write instructions

I’ve seen many posts arguing that LLMs will revolutionize user experience for the better. With users being able to express their intent in their own language, we can eliminate the learning curve when interacting with an application. I understand this claim, but I think it is somewhat naive to assume that people can always express themselves clearly and concisely. Just think of how many times you, as a human, have received instructions from other humans that were unclear or incomplete. As I read in another post about the challenges of LLMs – just doing what someone asks for isn’t always the right thing. Designing interfaces where user input is in natural language may increase accessibility, but it will likely reduce the quality of the output, and therefore increase frustration.

We’ve seen this happen with one particular tool that we made available, where users can instruct the system to generate model training phrases for them by describing a specific user intent in text. This used to be a daunting manual task, so automation is welcome. However, past the initial excitement, some users started asking the hard questions: what are the best practices to instruct the model for phrase generation? What sort of information needs to be included in the description? Ambiguity generates anxiety, and we don’t want to transform users into prompt engineers. Interacting with a clean UI with clear affordances can carry a much lower cognitive load than writing a thorough description of the outcome you’re looking for. This article by Davis Treybig is great and describes all of these design challenges in detail.

The ambiguity trap II  – how LLMs respond to instructions

Downstream applications relying on LLMs expect outputs that conform to a particular structure for effective parsing. While we can tailor our prompts to explicitly outline the desired output format, compliance is never guaranteed. We are seeing this with question answering. We prompt the model not to return an answer when it isn’t sure of one but, prolific chatter that it is, it will sometimes still respond with an “I don’t know” type of answer. How do we deal with this in the UI? The application’s frontend is not able to tell the difference between this answer and a “valid” one, with meaningful content.
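One way to give the frontend something to branch on is to normalize the model output before it reaches the UI. The sketch below is only an illustration: it assumes the prompt asks the model to reply with a sentinel string when it has no answer, and it adds a crude phrase check for the cases where the model ignores that instruction. The sentinel NO_ANSWER, the phrase list and the function name are all hypothetical, and a valid answer that happens to contain one of the phrases would be wrongly discarded.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sentinel the prompt asks the model to return when it has no answer.
NO_ANSWER = "NO_ANSWER"

# Crude fallback for when the model ignores the sentinel instruction.
REFUSAL_HINTS = ("i don't know", "i do not know", "i'm not sure", "cannot answer")


@dataclass
class ParsedAnswer:
    text: Optional[str]
    has_answer: bool


def parse_model_output(raw: str) -> ParsedAnswer:
    """Turn free-form model output into something the UI can branch on."""
    cleaned = raw.strip()
    if not cleaned or cleaned == NO_ANSWER:
        return ParsedAnswer(None, False)
    # Normalize curly apostrophes so "don’t" and "don't" match the same hint.
    lowered = cleaned.lower().replace("’", "'")
    if any(hint in lowered for hint in REFUSAL_HINTS):
        return ParsedAnswer(None, False)
    return ParsedAnswer(cleaned, True)
```

The UI can then hide the suggestion whenever has_answer is false, instead of showing the agent a polite non-answer.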

Despite this being a nuisance, to someone using a helper tool there’s one thing that is worse than not getting an answer, which is getting a wrong one. This is not a new problem as poor search engines can also produce noisy, non-relevant results with high confidence. But dealing with hallucination is a whole different challenge. Conservative prompt engineering can be effective at tackling this problem, but in the end we can never be 100% sure that the model will comply with instructions.

It’s still unclear to me how we can deal with the lack of predictability. For now, the only feasible option seems to be through prompt engineering. We need to make it a systematic task, with version control – essentially yet another function that needs to be integrated in the product development lifecycle. 

Working with context is hard

Building a question answering solution for business requires informing the LLM of the knowledge context of each customer. Answers need to be strictly based on company-vetted data. However, LLMs have context windows, that is, a limit on the number of tokens the model can access when generating a response. Some businesses can have hundreds of thousands of documents with relevant information. We use embeddings to measure content relatedness and select the correct data snippets – ultimately, the success of this operation will dictate the overall success of the feature. It’s just good old search – if that doesn’t work, LLMs won’t save you when it comes to knowledge retrieval.
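As a minimal sketch of that selection step, assuming each snippet has already been turned into an embedding vector by some embedding model (not shown here), relatedness can be scored with cosine similarity and only the top few snippets passed to the LLM. The function names and the three-dimensional toy vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions and usually live in a vector index.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_snippets(query_vec: Sequence[float],
                   snippets: List[Tuple[str, Sequence[float]]],
                   k: int = 2) -> List[str]:
    """Return the text of the k snippets most similar to the query vector."""
    ranked = sorted(snippets,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy vectors standing in for real embeddings.
    knowledge_base = [
        ("Refund policy", [1.0, 0.0, 0.0]),
        ("Office hours", [0.0, 1.0, 0.0]),
        ("Returns process", [0.9, 0.1, 0.0]),
    ]
    print(top_k_snippets([1.0, 0.0, 0.0], knowledge_base, k=2))
```

Whatever the model does afterwards, it only ever sees what this ranking lets through, which is why the retrieval quality caps the answer quality.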

Cost estimation: mission impossible

The more context you give the model, the better the performance – or at least that’s what we hope for. However, this will also increase latency… and costs – OpenAI charges for both input and output tokens. If we factor in the “natural” output unpredictability, customer variations when it comes to context and constant prompt improvements, making a sound prediction is a difficult task. Also, the LLM world is moving so fast that any prediction is bound to become outdated quickly – hopefully the trend is for cost to continue to go down as competition grows.
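A back-of-the-envelope sketch shows why the estimate ends up as a range rather than a number. It assumes per-token pricing with separate input and output rates; the prices, token counts and request volume below are placeholders invented for the example, not actual OpenAI prices or real usage data.

```python
from typing import Tuple


def estimate_request_cost(input_tokens: int, output_tokens: int,
                          input_price_per_1k: float,
                          output_price_per_1k: float) -> float:
    """Cost of one request when input and output tokens are billed separately."""
    return (input_tokens / 1000 * input_price_per_1k
            + output_tokens / 1000 * output_price_per_1k)


def estimate_monthly_cost_range(requests: int, input_tokens: int,
                                output_tokens_low: int, output_tokens_high: int,
                                input_price_per_1k: float,
                                output_price_per_1k: float) -> Tuple[float, float]:
    """Low/high monthly cost, since output length is not known in advance."""
    low = requests * estimate_request_cost(
        input_tokens, output_tokens_low, input_price_per_1k, output_price_per_1k)
    high = requests * estimate_request_cost(
        input_tokens, output_tokens_high, input_price_per_1k, output_price_per_1k)
    return low, high


if __name__ == "__main__":
    # Placeholder numbers: 100k requests a month, 3,000 tokens of prompt plus
    # context per request, and answers anywhere between 200 and 800 tokens.
    low, high = estimate_monthly_cost_range(100_000, 3000, 200, 800, 0.01, 0.03)
    print(f"between {low:.0f} and {high:.0f} per month")
```

Even in this toy setup the upper bound is half again the lower one, before accounting for customers with larger contexts or for prompts that grow with each improvement.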

Conclusion

“It was the best of times, it was the worst of times” – said Dickens on LLMs. 

It’s impossible not to be excited by the quantum leap in the conversational abilities of machines. You can have so much fun experimenting with them, and demos can be mind-blowing. However, if an incredible demo isn’t followed by actual usage – and, more than that, by a productivity gain (or at least a super pleasant user experience) – the result will sooner or later be churn. And the market seems to be moving fast from the initial excitement phase to converge on a handful of low-risk use cases, particularly in the B2B space – as with any other new tech. Some people will argue that we cannot afford to be conservative – I argue that we cannot afford to ship products that will not solve the customer’s job. Be bold, but learn fast!

Bismarckian Product Management

At first glance, it might be hard to draw parallels between modern product management and German unification. But I’m a product manager with an International Relations background, so that’s what you’ll get.

Otto von Bismarck, the diplomatic mastermind behind the unification feat, is famously credited with the quote “Politics is the art of the possible”. It reflects his pragmatic approach to governing and his understanding that compromise is the key to success. He’s one of the most influential figures of European history not because he wanted to change the world, but because he focused on achieving practical and attainable goals in a complex environment; not because he fought against constraints, but because he knew and accepted them.

Perhaps you can already see where this is going. The world of tech is full of idealistic aspirations, particularly in moments of disruption such as the one we’re experiencing with AI and Large Language Models. But at the end of the day, as we sit back and marvel at technological progress, product management too is the art of the possible. The reason we exist as professionals is because there is a need to establish a complex balance between existing resources, market conditions and customer needs. It’s not an armed revolution – it’s realpolitik in favor of the customer’s interest, a constant search for the best feasible solution within existing constraints.

Conceiving an innovative product takes creativity. Conceiving an innovative product that people need and love takes cognitive empathy. Actually delivering it takes all of this plus a healthy dose of pragmatism. To me, great product leaders stand at this intersection of skills. This might exclude some of our most famous dreamers, but it includes many distinguished unknowns who make our lives better every day by delivering progress in small but significant increments.

Put the “V” in MVP

One of the most painful things to witness in the tech industry is the constant misinterpretation, or even distortion, of Agile or Lean concepts, and Minimum Viable Product (MVP) is a blatant example.

Eric Ries described it as “[the] version of a new product which allows a team to collect the maximum amount of validated learning about customers with the least effort”. Notice that the description starts with its purpose – however, this is commonly disregarded by product makers who focus only on the effort aspect. The concept, which reflects a more than reasonable concern with waste, is used to justify poor design decisions, insufficient resource allocation and an overall lack of vision and strategy.

Context is key: from Ries’ description, it seems that this is a strategy to validate market fit, and I would argue that it is particularly useful when there is a high degree of innovation involved. Nevertheless, I’ve seen it used countless times to describe long-standing backlog items – often in supposedly mature products – that aim to solve customer needs that have already been validated.

“Let’s just make an MVP” normally means “we have loads of stuff to do and we can’t prioritize properly or negotiate commitments, so let’s just make a poor man’s version of this solution”. User experience is the first victim – it just needs to work, it doesn’t matter if you need 10 consultants just to plug it in, or if customers are constantly raising support tickets because they don’t know how to operate it. Reliability is second – not to mention security.

Where is the viability in this? 

Imagine if the folks in the Toyota factory back in the 40s did the same – to save time, they would start shipping cars that were unreliable, unsafe, and that no one knew how to drive. Would this be considered a major success in efficiency? Or would it hurt their credibility to death?

Trapped in a chamber: sprint review vs reality

Nothing beats the feeling of presenting a brand new, beautiful and functional user interface for the first time – except for the realization that there is no plan to actually make it available for real users.

Tech companies adopted Scrum as a way to deliver on the Agile promise of fast, incremental value. There is a clear product goal that everyone in the team strives to accomplish during the designated sprint time. When working software is delivered frequently, everyone is happy. The sprint review is a success story, progress is praised, people engage in self-congratulating behavior that keeps morale high. It feels like being in a cozy room with padded walls covered in golden glitter – we did it folks, it works! 

But who is paying for the golden walls? The first of the principles behind the Agile Manifesto contains the answer – our highest priority is to satisfy the customer. However, on many occasions, our beautiful working software is not actually available to those who can take advantage of it for purposes other than self-gratification, for multiple reasons:

  • It’s shielded by a complex provisioning layer that requires the intervention of professional services teams
  • It lacks discoverability, meaning that although users have access to it, they will never find it (unless someone guides them)
  • It replaces another existing product or feature and there is no migration plan
  • It’s not scalable to the point that you can actually roll it out to a user base that is large enough to provide relevant feedback
  • More generally, it lacks 3 out of the 4 marketing Ps – the product works, but there is no price (how does it fit our pricing strategy?), no promotion (how is it going to be communicated?) and no place (how is it going to be distributed among consumers – or users?)

For some reason, this is rarely a cause of concern for scrum team members – and I am not just referring to product owners (or product managers, considered a broader function), although that sort of attitude is even harder to process when coming from folks in those roles. 

Is it because we think this is always someone else’s responsibility? Are we afraid to break the glass into the outside world, where people (aka customers) are not always super nice and friendly? Do we actually enjoy building products in a vacuum? So many questions, so little time.

Then there comes a time when the golden, padded walls start to crumble. Everyone starts to realize that there is no actual adoption, developers complain to their managers about the lack of transparency and usage data. 

When management decides to shield teams from the brutality of early stage feedback and adoption numbers, they are only buying motivation for the short term. Eventually people will start questioning why the hell they are committing to deliver something within two or three weeks if real users will only be able to get their hands on the prize in half a year or more (assuming things won’t be ditched before they even get to that stage, due to strategy shifts). So let’s be honest, as this is the only way to build trust. And let’s work together to bridge the gap.

Three months in as a PM

Building up the confidence to change

They say it’s not the destination, it’s the journey. And whilst that is often not true for actual traveling (who loves a 12-hour flight?), it is definitely the way I like to look at my professional path, which comes with both a feeling of pride and a sensation of chaos.

I wrote before about being a generalist and how it is both wonderful and disturbing. When I started working in the SaaS industry 8 years ago as an account manager, I saw it as a good opportunity to be able to pay the tuition fees for my Master’s degree in Economics (I was writing a thesis on wine exports). Now I’m managing a product in a hyper-growth company. In the meantime I’ve lived in two different countries and learned a few lessons on leadership, mostly by failing at it. It’s been an interesting ride, only possible by constantly seeking discomfort and calculating risk. Unless you were born to be a risk-taker and/or have a really great safety net, striking a balance between these two factors is essential.

When you are a young girl with a diploma from an Arts school, it’s hard to build up your confidence levels in a male-dominated industry where engineering talent is the market’s most valuable asset. To be honest, it’s hard to feel confident in any context. Also, when you finally get there, you have to face how other people struggle to deal with it. People tend to treat confidence as a personal trait but, as with any other skill, you can work to develop it. To me, the best way to train confidence is through learning, either horizontally (learning something completely new) or vertically (becoming really good at something you already know), but with a strong preference for the first option. There are a couple of things that help with that:

Being mentored by smart people

You can’t push for this – it’s a matter of luck. When I say smart I don’t necessarily mean book-smart (although – unpopular opinion these days – I find that helps), but someone who is capable of having a comprehensive world/market/company vision that is not tied to common biases and assumptions ranging from gender, age, education, family background (…) to your haircut and tone of voice. Someone who values your curiosity, your effort, your ability to organize thoughts and speech, who you feel comfortable talking to and learning from. I don’t like the expression “role model”, as it does individuality a disservice; what I mean is someone who respects and protects your intellectual integrity. This is the only way you’ll feel safe asking questions.

Being open to learn autonomously

Indeed, asking questions may be the best way to learn and improve, but the first person you should be asking questions to is yourself. Research. Use technology to your advantage. Read. Try to come up with your own solution to the problem. And then go to others for validation – if you get it, you’ll feel incredible for having achieved that on your own. If you don’t, think about what needs to be done to improve that process for others: see if there’s a gap in an internal knowledge management platform that you can work on, if there’s some training that should have happened during onboarding. Turn your frustration into someone else’s accomplishment. Contribution makes everyone feel better.

Settling in

So the goal for me was to learn a new trade in a new company – switching from a leadership to an associate position – whilst working 100% remotely and raising a one-year-old in the midst of a pandemic.


Here are some of the key learnings so far:

Users are your source of truth, not the Jira board

If you’re looking for a source of truth about product status, don’t trust the Jira board or sprint reviews – trust users. I also learnt that it is possible to be constantly in touch with your customer and yet not know how the product performs in real usage scenarios. This happens because, in B2B particularly, the person who buys your software is not necessarily the person who is going to use it.

I don’t think you need to eat your own dog food. User research offers a variety of techniques that will bring you closer to the reality of product usage. Just don’t settle for second-hand feedback, as it can be deceiving.

Observe with empathy, react with reason

It’s normal to feel bad when you see something you helped build fail at its purpose. And yet this is a very likely outcome of user research. Being able to feel more empathy towards users than pride over your own choices and teamwork is a lesson I learned the hard way while working in customer success, providing mission-critical software for large enterprises. Being proud of your tech doesn’t pay your salary – happy customers do.

The customer is not a moron, she’s your wife. If it’s not working for them, don’t patronize – save some time to try to get to the root of the problem and come up with a fix. The fix could be a minor tech improvement or a completely new product vision – embrace change without seeing it as a personal or even a corporate failure. However, I think proportionality is also key, because trying to reinvent the wheel every 2 weeks is not sustainable.

Gain access and control

I’d argue that while you don’t need to be your product’s heaviest user, it is a good idea to check the actual state of the product every time something is deployed to production, even if it’s not a major feature. It’s not a matter of not trusting the QA process – it’s just that there are always tiny interaction details that you missed in the design stage. Iterating over mockups alone makes me nervous.

I’d also like to add a comment about feature flags. They are great for guaranteeing smooth feature rollouts to customers, but I found it crucial to have direct control over them: as a PM, I need the flexibility to enable or disable features without disrupting the development cycle. I need the freedom to decide when to beta test a feature with a customer, or just to enable stuff for testing in internal production accounts.
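To illustrate the kind of control I mean, here is a minimal sketch of per-account feature flags. It is not how any particular flag service works; the FlagStore class, the flag name and the account identifiers are hypothetical, and a real setup would persist the flags and put permissions and an audit trail around them.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class FeatureFlag:
    name: str
    enabled_globally: bool = False
    enabled_accounts: Set[str] = field(default_factory=set)

    def is_enabled(self, account_id: str) -> bool:
        # On for everyone, or on for this specific account (e.g. a beta customer).
        return self.enabled_globally or account_id in self.enabled_accounts


class FlagStore:
    """In-memory store; stands in for whatever backs the flags in production."""

    def __init__(self) -> None:
        self._flags: Dict[str, FeatureFlag] = {}

    def enable_for_account(self, flag_name: str, account_id: str) -> None:
        flag = self._flags.setdefault(flag_name, FeatureFlag(flag_name))
        flag.enabled_accounts.add(account_id)

    def disable_for_account(self, flag_name: str, account_id: str) -> None:
        flag = self._flags.get(flag_name)
        if flag is not None:
            flag.enabled_accounts.discard(account_id)

    def is_enabled(self, flag_name: str, account_id: str) -> bool:
        flag = self._flags.get(flag_name)
        return flag is not None and flag.is_enabled(account_id)


if __name__ == "__main__":
    flags = FlagStore()
    flags.enable_for_account("new-dashboard", "internal-test-account")
    print(flags.is_enabled("new-dashboard", "internal-test-account"))
    print(flags.is_enabled("new-dashboard", "some-customer"))
```

The point is that turning a feature on for one beta customer or one internal account is a data change, not a deployment, so it doesn’t need to wait for a sprint.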

Don’t fall into the Scrum trap

Scrum provides a framework to organize product development, not product management. It doesn’t offer a lot of guidance when it comes to discovering new features, prioritizing, aligning those priorities with management, making sure your solutions are tightly integrated with the product’s ecosystem, communicating new features to your stakeholders, choosing the best methods for user research, … the list goes on. It’s easy to fall into the trap of organizing our work around ceremonies and ceremony preparation. I adhere to Agile principles but dread the ritualistic vibe Scrum brings to teamwork. A framework (any framework) is supposed to make your job easier, not define it (this one goes out to all of you hiring product owners).

Build and communicate a vision

Another Agile trap, which I believe comes from a misinterpretation of its main principles, is the belief that long-term planning and vision statements are no longer useful. Value is more easily delivered if you adopt short cycles and constant iteration. I don’t argue with this, but I do argue that it is not incompatible with having a long-term vision for a product. MVPs deliver value, but they don’t inspire people. Having that inspirational story to tell is helpful to gain internal alignment, to deliver marketing messaging, to motivate. A vision is not a one-time commitment – it’s a compass for everyday decisions and is constantly evolving.

Go for that swim

On a final note – I live by the beach in the north of Portugal, where sea temperatures range between 15 and 17°C in the summer months. It’s beautiful and cold. I stand for long minutes with my feet on the wet sand getting mentally ready to take a plunge. Then I look to the side and there’s a group of kids just playing around in the water, completely indifferent to the fact that it’s 20°C below their body temperature. I feel silly and old. But I’m not, so I go for it.

The same goes for the job. I’m overly conscious of my limitations, and that frustrates me. But then I realise I’m just a kid, I can learn anything. So I just go for that swim.