
nehagetschoice · on March 5, 2024

Yep - the AI-generated copies in the first run are pretty generic - that's not the value add (sorry if the demo made it appear that way). The key value adds are (1) quickly running an A/B test with no code, and (2) running inferences on the test data and feeding them back to the model to generate better copies in the next iteration (shown in the demo video).

vintagedave · on March 5, 2024

I understand better now -- and I was focusing on the AI aspect because it's an interest. But I get the value prop and thanks for taking the time to explain :)

nehagetschoice · on March 5, 2024

This is cool. We definitely spent weeks sitting in customer interviews, then parsing information and patterns from those into meaningful insights, and sometimes that's just so hard to do, that it wouldn't even be concrete enough to fundamentally change our roadmap. I'm excited to see if this tool can bring nuanced insights to research conversations.

nehagetschoice · on March 4, 2024

We either run experiments with more traffic diverted to treatment (e.g. 50% instead of 1%), or run them for a longer time. This is why we need companies to have at least 2K monthly users to run pilots with us.
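To make the traffic-versus-duration trade-off concrete, here is a rough sketch of the usual sample-size arithmetic for a two-sided two-proportion z-test. The function names, the 5% baseline rate, and the 20% relative lift are illustrative assumptions, not numbers from the product. Sizing the smaller arm at the equal-split requirement is a conservative simplification.

```python
from math import ceil
from statistics import NormalDist


def users_per_variant(base_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate users needed in each arm to detect the given relative lift."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_run(n_per_variant, daily_users, treatment_share):
    """Days until the smaller arm has collected n_per_variant users."""
    smaller_share = min(treatment_share, 1 - treatment_share)
    return ceil(n_per_variant / (daily_users * smaller_share))


# Hypothetical pilot: 2K monthly users, 5% baseline, hoping for a 20% relative lift.
needed = users_per_variant(base_rate=0.05, relative_lift=0.20)
for share in (0.01, 0.50):
    print(f"{share:.0%} to treatment: {days_to_run(needed, 2000 / 30, share)} days")
```

Under these assumptions, moving from a 1% to a 50% treatment split shortens the run by roughly the ratio of the two shares, which is the lever described above.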

earthling998 · on March 5, 2024

Thanks! Cool idea on that front and in general; fun to see people taking copy seriously!

I am still a little fuzzy on how this might play out on a real site, since lots of monthly users doesn’t always mean lots of conversions (especially for one-time, high dollar transactions).

Let’s say I have a page that gets 100k unique visitors per month. I show them 5 different variants of a “nudge” widget. Some do better than others, but they all hover at <1% CTR.

While there may be a story to tell as far as winners and losers (e.g. mobile users converted to variant A at 2x the rate of desktop users), how do you confidently report a “winner” from what amounts to a few thousand conversions in a month?
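One common way to frame this question is a two-proportion z-test of each variant against a pre-chosen control, with a Bonferroni correction for the number of comparisons. The sketch below is only an illustration: the conversion counts are made up, variant "A" is assumed to be the control, and nothing here is claimed to be how the product computes significance.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: both variants share one conversion rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def compare_to_control(conversions, visitors_per_variant, control, alpha=0.05):
    """Test every non-control variant against the control (Bonferroni-adjusted)."""
    adjusted_alpha = alpha / (len(conversions) - 1)
    results = {}
    for name, conv in conversions.items():
        if name == control:
            continue
        p = two_proportion_p_value(
            conv, visitors_per_variant,
            conversions[control], visitors_per_variant,
        )
        results[name] = (p, p < adjusted_alpha)
    return results


# Hypothetical month: 100k visitors split evenly over 5 variants, all under 1% CTR.
conversions = {"A": 120, "B": 140, "C": 150, "D": 160, "E": 190}
for name, (p, significant) in compare_to_control(conversions, 20_000, "A").items():
    print(f"{name} vs A: p={p:.4f} significant={significant}")
```

With rates this low, small gaps between variants will usually not clear the adjusted threshold in one month, which is the concern raised above. Only a fairly large relative difference would.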

nehagetschoice · on March 4, 2024

So far, we have seen 1) Companies like Statsig that are standalone A/B test platforms 2) Customer engagement platforms like Customer.io that allow you to do A/B testing on their campaigns 3) A few website-only platforms like Coframe and Mutiny that do not integrate with internal products like notifications and logged-in pages

The things that make us stand out -

1) There is no tool afaik that looks at inferential learning based on past experiment results on copies. The experiment analysis and the continuous feedback loop is missing.

2) To make (1) happen, the first step is copy versioning and management. Without a tool that can abstract strings in a CRM, and monitor learnings over time, it's hard to make (1) possible.

3) We integrate with tier 0 services like notifications with a low latency solution that makes copy iterations possible for in-product features, outside of logged out surfaces like web pages.

Would be interested to know if you have seen software that solves for (1), (2) and (3)?

Areibman · on March 5, 2024

Makes sense. Coframe (https://coframe.ai) comes top of mind

nehagetschoice · on March 4, 2024

Interesting. One of our customers just requested this today, where they want to test form fields. Are you imagining higher form submission conversions by optimizing the number, type, and layout of the fields?

edmundsauto · on March 4, 2024

Yes ideally closing the loop on the value of those conversions as well

The pitch would be “automatic ab testing for your form submissions”. I talked to a few local lead gen companies years back and they thought it was a neat idea, I just never got around to building it.

zilian · on March 4, 2024

A quick Google search gives me several tools for this: Formstack, Omnisend, Zuko Form Analytics...

edmundsauto · on March 5, 2024

AFAICT Formstack and Omnisend are marketing automation and analytics services, not automatic split testing. I got out of the marketing space a while back but I do appreciate the cross check! However it doesn't look like those products are doing exactly this.

nehagetschoice · on March 5, 2024

+1 that's my understanding as well.

edmundsauto · on March 5, 2024

I also think this could be a really cool chance to use a multi-armed bandit optimization algorithm, as it self-corrects so users don't even need data readouts.
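For anyone curious what that could look like, here is a minimal sketch using Thompson sampling, one of several multi-armed bandit strategies (picking it is an assumption on my part, not something specified above). The variant names and conversion rates are hypothetical, and the "visitors" are simulated.

```python
import random


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over a fixed set of form variants."""

    def __init__(self, variants, seed=None):
        self.rng = random.Random(seed)
        # [alpha, beta] per variant, starting from a uniform Beta(1, 1) prior.
        self.stats = {v: [1, 1] for v in variants}

    def choose(self):
        # Draw a plausible conversion rate per variant and show the best draw.
        draws = {v: self.rng.betavariate(a, b) for v, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def record(self, variant, converted):
        # A conversion bumps alpha, a non-conversion bumps beta.
        self.stats[variant][0 if converted else 1] += 1


def simulate(true_rates, rounds, seed=0):
    """Serve simulated visitors and count how often each variant is shown."""
    sampler = ThompsonSampler(true_rates, seed=seed)
    world = random.Random(seed + 1)  # separate stream for visitor behaviour
    pulls = {v: 0 for v in true_rates}
    for _ in range(rounds):
        variant = sampler.choose()
        pulls[variant] += 1
        sampler.record(variant, world.random() < true_rates[variant])
    return pulls


# Hypothetical forms: the shorter form converts three times as often.
print(simulate({"long_form": 0.05, "short_form": 0.15}, rounds=5000))
```

Traffic drifts toward the better-converting form as evidence accumulates, so there is no separate readout step. The trade-off is that the uneven exposure makes classic significance reporting harder afterwards.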

nehagetschoice · on March 4, 2024

Yeah - two themes here -

1) Is copy even important? I think it is. If this post was titled, "Auto-tune experimentation for short-form content optimization", it might make half the audience confused about the product. In fact, the 1-liner we use for HN is very different from when we talk to VCs, because the audience is different with different goals & backgrounds. I guess the point I am making is that messaging has to be contextualized, depending on users, platform, and goals.

2) PMF vs copy - I agree that the two are orthogonal. Copy is not going to solve for the lack of a PMF (and it shouldn't). Exactly the point above - the goal is to help more and more users comprehend what you do, hopefully in a way that's more personalized to them.

apsurd · on March 4, 2024

PMF isn't orthogonal to copy if you're experimenting with copy to drive an outcome. what is the outcome then? how do you measure it? isn't the state of the art conversion?

That's the challenge: the conversion funnel is complex, with many factors, and the largest one of them, in simple terms, is PMF.

if we measure downstream like clicks or inbound leads etc, that's more aligned with "discovery of PMF" and that's good. But it should be stack ranked as such: it's not moving the needle, it's exploratory.

nehagetschoice · on March 4, 2024

Appreciate the candid comments and opinions here. I'll break it down and go over them -

1. Having access to data is not the problem we're after (most companies have the first-party data in-house). The key challenges are around having a platform that fundamentally separates strings (copies) from code, and lets you update them effortlessly, based on inferences from that data. So, I am not sure I understand why this is a Google/Meta product?

2. UX is not the value add from this product - agree with you on it (even if it appeared to be the emphasis). The ability to make scientific edits without re-deployments and accelerating continuous iterations based on user feedback, is what we are going after.

3. Curious why you think A/B test results are fictional? Getting stat sig results is probably the surest way to conclude results. Perhaps there's a different angle you are talking about here?

4. RE: don't A/B test at all. Given the number of users that get exposed to every change a consumer company as large as Twitter makes, not testing can be disastrous, which brings up another great point - large companies are struggling to use all the (generic) gen AI content today, because it needs to be performance tested before it can be placed in front of millions of users, and that's not scalable today.

5. You may be alluding to another good point - copy is as much art as it is science, and writing it well takes context, quality, and expertise. That's something we hold a strong opinion on, and we don't see this or any other tool eliminating that expertise. The goal is very much to streamline and augment those workflows.

doctorpangloss · on March 4, 2024

> So, I am not sure I understand why this is a Google/Meta product

Which audiences am I optimizing copy for? Where do they come from? Some Google, Meta, TikTok or Apple owned channel, right?

Google has indexed every website. Meta has every ad. Can't they just tell me what copy to use? Why don't they? I mean, they know! They know what copy works best, for pretty much everything. They can sort by clickthrough rate, revenue due to the purchase data they have, they have everything! You talk about SMB - they know every SMB! They know your margin and your COGS and whatever because in aggregate they observe rational spending where all the cost is eaten by marketing; they know your potential market, etc. They know all this. They don't need to run tests. They can look at very recent, weeks old, historic data, and they have way more than enough samples to answer these questions to more or less the same degree of certainty and scientific rigor that any SMB doing it themselves as a first party can do.

I mean if they wanted to, they could run the A/B tests for you! Google could "just" serve a different web page with slightly altered copy. And see if more people "click" or "convert" or whatever. They have better technology, 1,000,000x more data... Why don't they do this? You wouldn't even need UX. It could just happen, you would check a box, and they would do this.

> fundamentally separates strings (copies) from code... and lets you update them effortlessly... The ability to make scientific edits without re-deployments and accelerating continuous iterations based on user feedback...

You keep talking about UX for developers and product managers. These are UX things. It doesn't actually matter. The existence or non-existence of what you're talking about doesn't correlate to higher or lower conversions, it isn't a scientific opinion on the practice of optimization, it is just a bunch of UX patterns to achieve it, but it could be achieved in many ways, perhaps with even better UX. Like in the example I gave, where Google "just" does this for you, which is the best UX because there is no UX, you don't need to separate strings from code, and you don't need to update them, because you don't need to do anything. Google could just do this. They own the channel, they see everything, they have the technology.

So why don't Google and Meta and Apple offer an automatic optimization product? You ought to have an opinion, it can't just be, "I don't know." I mean the sort of obvious answer is that "optimization doesn't really work" instead of "three paragraphs of bullshit."

> Curious why you think A/B test results are fictional? Getting stat sig results is probably the surest way to conclude results. Perhaps there's a different angle you are talking about here.

Well, one reason I am very confident they are fictional is that the people who have owned the channels for a decade haven't offered a tool to do this.

I mean maybe they will. Maybe it was a technology problem, but I don't believe so. You could have Markov Chained your way through 5 word long taglines and whatever. They didn't need to wait for generative AI to create valid test strings for people's websites. Indeed they could just let you copy the best performing taglines they see in their systems. Why. Don't. They?!

> Given the number of users that get exposed to every change a consumer company as large as Twitter makes

Another POV is that every change they made was bad. They thought they were a product organization, and they are really a backend engineering organization, where the best decisions are all based on first principles or executives' opinions, not on some unknowable measurement about audiences.

nehagetschoice · on March 4, 2024

I'm curious to learn what kind of challenges you had in your past growth roles. It would be great to understand specific examples/workflows, and how you dealt with them.

trevoragilbert · on March 5, 2024

One of the main ones that comes to mind relates to promotions. A good example is a fintech I worked at. We wanted to advertise specific offers to specific channels of customers, but we otherwise wanted communication with them to stay the same.

For example, we'd have the same opening email sequence, same retargeting, etc. for customers across channels but we'd want to have consistent messaging around offers (i.e. "$1000 off when you sign up" vs. "first month free" kind of stuff). It was tricky because we didn't want to advertise the same offer to everyone, so we wanted to carefully segment who was getting which offer and keep it the same over a window of time.

Unfortunately we didn't have a perfect solution. The closest we came to it was to have an experiment ID tied to their user account. Then we had a system where we would define the different experiments (including messaging and promotions) with each experiment having an experiment ID. It was far from perfect but worked.
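A related pattern, sketched here as an alternative to storing the assignment on the account, is to derive the variant deterministically from the user ID and experiment ID, so every channel that knows both IDs shows the same offer without a lookup. The function name, the example IDs, and the use of SHA-256 are illustrative assumptions, not a description of the system above.

```python
import hashlib


def assign_variant(user_id, experiment_id, variants):
    """Map a user to one variant, stably, for a given experiment."""
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # The first 8 bytes of the hash act as a uniform bucket number.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


# Hypothetical experiment using the two offers mentioned above.
offers = ["$1000 off when you sign up", "first month free"]
for user in ("user-17", "user-42", "user-99"):
    print(user, "->", assign_variant(user, "signup-offer-q1", offers))
```

Because the experiment ID is part of the hash key, a new experiment reshuffles users independently of earlier ones, while the email sequence, retargeting, and in-product surfaces all resolve the same offer for a given user.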

nehagetschoice · on March 4, 2024

Yes, we think of this as a decision tree, where initially, copies may look different by, say, user demographics and location. As we learn more about what's working well for different dimensions of users (e.g. topical interests, traffic source, platform type), the decision tree grows, and every single element in the anatomy of copies is optimized based on past learnings. In an ideal world, every user truly sees a unique copy tailored to them.

iamacyborg · on March 4, 2024

How do you do that in a meaningful, statistically significant way?

Not a gotcha, just genuinely curious.

nehagetschoice · on March 4, 2024

You'd be surprised how few companies actually do any systematic copy optimization - it remains ad hoc even at the big players, which is one of the reasons we started this startup.

In speaking to SMBs and large companies, our insight was that the problem of copy optimization resonates more with larger companies, as smaller companies are more focused on survival/basic marketing techniques like opening up new channels. Larger companies have already exhausted those levers, and are ready for more sophisticated optimizations.

doctorpangloss · on March 4, 2024

> You'd be surprised how few companies actually do any systematic copy optimization - it remains ad hoc even at the big players, which is one of the reasons we started this startup... In speaking to SMBs and large companies...

Did you talk to Stripe? Do they do systemic copy optimization on their landing page, or not?

mbesto · on March 4, 2024

> Larger companies have already exhausted those levers, and are ready for more sophisticated optimizations.

Sort of. Copy changes at larger companies because they're addressing a different audience and a different set of needs for that audience.

TL;DR - concrete language / early stage ==>> abstract language / larger enterprise