Hacker News | gbear0's comments

Those checked-in specs become the requirements for the system. So the next time you ask the AI to make a fix, it can use those specs as part of the solution and not break another requirement. Basically the code underneath keeps getting rewritten over and over, but that doesn't matter as long as it hits the required specs.

Do you rewrite the specs with new requirement changes if they've already been implemented? How do you supersede a spec?

I've been using LLMs daily and I spun up a few spec driven flows once or twice but like the person above I think the code is the source of truth.

Also why wouldn't you use TDD to enforce the 'spec' then?


Why was it a maintenance dead end? It sounds like you were able to iteratively work on it in its current state, but are you going to be the one maintaining the code?

I keep asking myself the same questions, and the conclusion I keep coming to is the clean modeled structure we want to see is for humans to maintain and extend, but the AI doesn't need this.

There's definitely an efficiency angle here where it's faster for AI to go from a clean modeled solution to the desired solution because it's likely been trained on cleaner code. Is this really going to matter though?

The best argument I can come up with is the clean modeled solution is better for existing development tools because it's less likely to get confused by the patch work of vibes throughout the code; but this feels like it ultimately becomes an efficiency concern as well.

This just might be the new reality, and we need to stop looking behind the curtain and accept what the wizard presents us.


> the clean modeled structure we want to see is for humans to maintain and extend, but the AI doesn't need this.

This does not match my experience. I do a lot of AI-assisted coding at this point, and what I've seen is that when the AI is asked to extend or modify existing code, it does a much better job on clean, well-structured and well-abstracted code.

I think the reason is simple, and tracks for humans as well: well-structured code is simply easier to understand and reason about, and takes a smaller amount of working-set memory. Even as LLMs get better with coding, I expect that they would converge on the same conclusion, namely that good structure + good abstractions make for code that is more efficient to work with.


Yeah I have had claude take over multiple internal (human written) projects that were in a dire state and spent a week just completely refactoring them and adding exhaustive tests before doing any new features. It's worth starting from a clean slate.

I keep hearing the assertion that you can’t make high quality, maintainable code with LLMs. The last two years using AI have shown me exactly the opposite.

I think it’s all about the structure you use to work in and how you use the model. We are shipping better, more human-friendly code, with fewer bugs than we ever did before, and doing it at 1/10 of the pre-LLM cost.

But we are definitely not vibe coding, and the key seems to be devs with years of experience managing teams, managing the LLM instead. Basically you create the same kind of formal specifications, conventions, and documentation that you would develop for a project with two or three teams, then use that to keep the project on the rails recursively looping back through the docs as you go along. I’ve only had to back out of a couple of issues over the last year, and even though that cost a couple of hours, it was still extremely cheap.

Meanwhile we are shipping at 4x speed with 1/4 the labor, and the code is better than it was because the “overhead” of writing maintainable, self documented code has inverted into the secret ingredient to shipping bug free code at unprecedented speed.

If you just explain the standards to which you want the code written, use a strict style guide, and have a separate process that ensures test coverage (not in the same context), you can get example-quality code all the way through. Turns out that’s also in the training data.


Many of us recognize that the days of nearly-free tokens are quickly drawing to a close, and at some point humans may very well have to dig their keyboards out of cold storage and return once again to the code mines.

OAI and Anthropic need to generate cash flows from operations - once they go public that’s it. Any future funding for reinvestment has to come from internal funds beyond existing raising + IPO.

So yeah, it’s imminent. Let’s see how demand shifts in response in the future.


June 1st for all the folks taking advantage of copilot. It was an astounding deal and a lot of people were “abusing” it.

The reason why you will never get software engineers (in companies) to accept the man behind the curtain is liability. If a human software engineer is still responsible for what happens when the AI developed code has a catastrophic bug or security vulnerability, then the only way for the human to know if there is a problem is to be able to read through the code or run it through some <insert advanced formal verification tool here> that guarantees zero issues.

I think we eventually end up at the tool approach via vendors providing the tools to other companies, but it still feels like there's a long road ahead to get there.


> but the AI doesn't need this

That's not true. The LLM performance will degrade as the codebase gets messier as well. You get to a point where every fix breaks something else and you can't really make forward progress.

Yes, you might be able to get a bit further with a messy codebase just because the LLM won't complain and will just grind through fixing things, but eventually it will just start disabling failing tests instead of actually fixing things.


The token cost to fix might surpass what a human would cost to just do it.

> Why was it a maintenance dead end?

LLMs have a limit to how deep they can understand and refactor architectural issues.

That limit is far, far lower than a human's.


>This just might be the new reality, and we need to stop looking behind the curtain and accept what the wizard presents us.

This is how societies become shittier. People who are ostensibly responsible for doing their jobs not giving a damn about quality.


Sometimes I think the main value in AI-maintained code being “high quality” is when the structure can enforce invariants. If invalid states aren’t representable, then the AI can’t easily add bugs in the future.

Of course that just leads to: what’s the best way to achieve that goal? Through elegant code or adding lots of tests? Which is a debate from long before LLMs existed.
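
To make the "invalid states aren't representable" idea concrete, here is a minimal Python sketch. The order/payment domain and the names `Pending`, `Paid` and `pay` are made up for illustration, and Python only checks this at construction time (plus whatever a static type checker catches), so treat it as a sketch of the idea rather than a guarantee:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Pending:
    """An unpaid order: it has no payment_id field at all."""
    order_id: str


@dataclass(frozen=True)
class Paid:
    """A paid order: cannot be constructed without a payment id."""
    order_id: str
    payment_id: str

    def __post_init__(self) -> None:
        # Reject the one invalid state the field types alone would allow.
        if not self.payment_id:
            raise ValueError("a Paid order needs a payment_id")


# An order is exactly one of the two shapes, never "paid but no payment".
Order = Union[Pending, Paid]


def pay(order: Pending, payment_id: str) -> Paid:
    """The intended transition from Pending to Paid."""
    return Paid(order.order_id, payment_id)
```

Because both classes are frozen, later code (human or AI written) can't flip an existing order into a half-paid state by mutating a field; it has to go through a constructor that enforces the invariant.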


I've had a similar desire for a code modelling system for decades, so I've given it A LOT of thought, and there has been a lot of older research into Zoomable UIs and Semantic Zoom. Code Bubbles (https://learn.microsoft.com/en-us/shows/alm-summit-2011/code...) is the closest I've seen to the idea, but doesn't cover the scope I want.

The biggest challenge to me is the UX: navigating the relationships between entities (systems, components/modules, classes, functions, read/write memory, etc.) requires a lot of design effort around how they work together consistently at all levels. Conceptually, your view is a set of boxes that are a filter/group-by over a lot of entities at some level, and you want to explode only some of those entities. E.g., say you want to zoom into a micro-service's component level, but still see external APIs, which could be a single box per API or boxes for each endpoint. So the control you need over the way zooming works and the 'lens' over relationships (filter/group-bys) can easily become very complex; probably a good research project itself though!
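
A rough sketch of the "boxes as a group-by with selective explode" idea, in Python. Everything here is an assumption for illustration: entities are modelled as hierarchical paths (service, component, endpoint), and the view is just a set of expanded prefixes. A real tool would also need to handle relationships between boxes, which this ignores:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Path = Tuple[str, ...]


def boxes(entities: Iterable[Path], expanded: Set[Path]) -> Dict[Path, List[Path]]:
    """Group entities into the boxes visible at the current zoom state.

    Each entity lands in the shallowest ancestor box that has not been
    expanded, so collapsed parts of the system stay as a single box.
    """
    view: Dict[Path, List[Path]] = defaultdict(list)
    for path in entities:
        depth = 1
        # Descend only while the enclosing box has been exploded.
        while depth < len(path) and path[:depth] in expanded:
            depth += 1
        view[path[:depth]].append(path)
    return dict(view)


entities = [
    ("orders", "api", "create"),
    ("orders", "api", "cancel"),
    ("orders", "db", "orders_table"),
    ("payments", "api", "charge"),
    ("payments", "api", "refund"),
]

# Zoom into the orders service's components; payments stays one box.
component_view = boxes(entities, expanded={("orders",)})
```

Adding `("payments",)` and `("payments", "api")` to `expanded` would switch the external API from one box to a box per endpoint, which is the kind of per-entity control the zooming would need.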

I do think it's possible to build a good interface that would allow viewing from global cloud scale systems and right into the code through multiple paths, like design patterns/components or git repos with files/folders, but I'm not sure how nice it's going to be to use. There's a reason UML modelling didn't stick around. And I'm not sure there's enough of a business case to fund it, but I'll definitely keep hoping to see it some day.


On the idea of interpreting the weights, I've been very interested in whether it's possible to compute basis vectors of the weights matrix to define the core concepts within the model, and then do a change of basis to allow reorganizing the model around more human-understood concepts.

I think the inherent compression of a specific training set into a matrix makes this more difficult cause the basis vectors likely won't contain clean representations of human ideas, but I also wonder if starting a new training set with an initialized (or fixed) matrix of human defined concepts would help align the model's weights to something that can be interpreted.


Things get obfuscated because someone's viewing the problems from a different abstraction lens, and they're building a system onto that lens.

Eg. Iterate through an array:

  const arr = [1, 2, 3];
  for (let i = 0, l = arr.length; i < l; ++i) { console.log(arr[i]) }
Let's model it differently using an iterator:

  const arr = [1, 2, 3];
  const arrIter = arr[Symbol.iterator]();
  let i = arrIter.next();
  while (!i.done) {
    console.log(i.value);
    i = arrIter.next();
  }
At this level it's still pretty obvious what's going on, but you can still see that there's a level of abstraction between an array access vs calling 'next/value', and that obfuscates what is actually happening at the computation/instruction level.

If I extend this another level then I'm going to start modelling problems using an iterable and not an array/index. New requirements come in and we extend to use an async iterable. Everything still works nicely, but in some scenarios where the actual iterable is just an array, now there's a lot of extra overhead to just do an index lookup.

Using the iterator allows the code to be reused in more scenarios, but there's usually a cost to switching the lens of abstraction so that it fits into a problem modeled differently.
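
The async step described above can be sketched the same way. This version is in Python rather than JavaScript (a choice made for this example, not something from the snippets above), and the function names are hypothetical. The point is only that once the code is modelled around an async iterable, a plain list needs an adapter and pays for the event loop just to do what used to be an index lookup:

```python
import asyncio
from typing import AsyncIterator, Iterable, List


def total_indexed(arr: List[int]) -> int:
    """Array lens: direct index access, only works for sequences."""
    total = 0
    for i in range(len(arr)):
        total += arr[i]
    return total


def total_iter(items: Iterable[int]) -> int:
    """Iterable lens: works for lists, generators, and other iterables."""
    total = 0
    for item in items:
        total += item
    return total


async def as_async(items: Iterable[int]) -> AsyncIterator[int]:
    """Adapter so a plain in-memory list fits the async-iterable lens."""
    for item in items:
        yield item


async def total_async(items: AsyncIterator[int]) -> int:
    """Async-iterable lens: works for streams, but every item is awaited."""
    total = 0
    async for item in items:
        total += item
    return total


arr = [1, 2, 3]
results = (
    total_indexed(arr),
    total_iter(arr),
    asyncio.run(total_async(as_async(arr))),
)
```

All three produce the same sum; the difference is how much machinery sits between the caller and `arr[i]`.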


Has anyone tried to add special clauses to their initial contract along the lines of: "If the offer is rescinded before the start date, the applicant will be awarded 3 months salary. If the applicant is fired within 3 months of the start date, the applicant will be awarded relocation fees to previous location." And you'd update that based on costs of moving, or possibly the state of the economy and your risk tolerance. If they don't agree to that, then they clearly aren't serious about hiring you, and you dodge a bullet.

This would hopefully reduce cases where they hand out multiple offers for a single job, or at least compensate applicants due to a change from management.


I really think there needs to be a rework of contract law when it comes to the idea of what is permissible and what is considered an unconscionable contract.

For example, most relocation agreements I have seen have a stipulation whereby if the employee quits within a year or 6 months or some defined time period, the employee must pay back any relocation costs incurred by the company. There is, however, no reverse stipulation. That should make the contract unconscionable.


I suspect that contract / tort law would cover this situation already in many jurisdictions. There's a contract to hire the OP and I'm guessing that contract didn't have language about rescinding the contract (my Amazon contract did not), only about being an employee-at-will. Such a clause would probably allow an employer to fire the person once they are hired, but likely wouldn't let them get out of a contract to hire the person first. If I were the OP I would talk to a lawyer and get them to negotiate payment of all actual damages rather than the 1 month pay offer on the basis that this is possibly either fraud or breach of contract. I am not a lawyer, but I LARP as one sometimes.


Unless you are a special snowflake with a unique skillset, why would a company the size of Amazon ever veer from their standard contract?


I've negotiated with some very large companies at times.

No one can operate a company as a single giant monolith. They're all subdivided into departments or subcompanies or subunits in some way or other. And these subunits often have more or less leeway and authority to act by themselves (depending on the exact organization).

My favorite story is when the manager across from me happened to have signing authority for the relevant subunit. So he just grabbed a sheet of A4 out of the printer, scribbled down what we agreed upon, put his signature under it, and handed it to his secretary "Please file this in Kim's file". Didn't even blink. <5 minutes. Legally binding written contract.

Another time I made a cultural mistake: I actually gave a Very Large customer my best offer upfront (negotiating as a small company owner this time). I should have tacked on 20% or so, just to give the manager something to negotiate over. Lesson learned: Big company managers aren't just able to negotiate; some actually almost feel cheated if you don't give them anything to negotiate over! ;-)


Now, I could negotiate to an extent with a smaller company where I would be a “special snowflake”. If they were looking for someone with my combination of skills.

But the company-changing work I did at the smaller company I worked at from 2018-2029 gets me an attaboy for adding 5% to my division's revenue, a message on our Slack channel, and life goes on.


Why are they hiring people that need a visa and need to relocate across the world?

The ask for compensation for getting dropped before the start date should usually be an easy one for a company to accept since that shouldn't be normal operating procedures. Otherwise, they're aware that they're explicitly screwing over people and aren't serious about hiring. Now we're back at my earlier claim, and you dodged a bullet.

You don't need to be special, but you definitely shouldn't think of yourself as a cog in the machine. If you do think of yourself as a cog, or this is an offer you can't refuse, then you've already given up your bargaining power, and you are willing to accept this.


You are a cog in the machine with a company that has over one and a half million employees.

Andy Jassy is my skip x 7 manager and he doesn’t know me from Adam. I doubt my skip x 3 manager would know me if he ran into me on the street.

How much bargaining power do you think you really have? If any of us got hit by a bus, they would have an open req out before our bodies were in the ground, and we would only be remembered when our name came up during a git blame.


I've always wanted to see a swarm of miniature roomba drones that can automatically clean/collect dust from anywhere in the house.


I think my new favourite way of managing runbooks is to actually build them into a file tree of a bunch of simple python subcommand scripts, and have a run.sh script that scans the file system and uses argparse to construct a cli to call each script.

  # call ./runbooks/stack/update_secret.py
  # could update a secret in a vault, or update it in your deployed app
  ./run.sh stack update_secret --env=dev --name=foo --file=secret.txt
Most of the time my python scripts are glorified CLI commands like `docker service update` that are called through subprocess, so you shouldn't need to install dependencies beyond what you'd be typing in the CLI. It's also easy to add a verbose option to print out the commands it runs so you can do it manually.

  # call ./runbooks/services/build.py
  ./run.sh services build -v
  > #--- Building images ---
  > #> DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 --label "myapp" -t example-admin-ui:local "./admin-ui"
  > #> DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 --label "myapp" -t example-frontend:local "./front-end"
  > #> DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 --label "myapp" -t example-nginx:local "./nginx"
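
A minimal sketch of what such a verbose wrapper could look like. The helper names `format_cmd` and `run` and the `dry_run` flag are assumptions for this example, not something from the actual scripts; the only library calls used are `shlex.join` and `subprocess.run`:

```python
import shlex
import subprocess
from typing import Sequence


def format_cmd(cmd: Sequence[str]) -> str:
    """Render a command so it can be pasted back into a shell."""
    return shlex.join(cmd)


def run(cmd: Sequence[str], verbose: bool = False, dry_run: bool = False) -> int:
    """Run a CLI command, optionally echoing it first.

    dry_run (an addition for this sketch) prints the command without
    executing it, which is handy when you want to do the step by hand.
    """
    if verbose or dry_run:
        print("#> " + format_cmd(cmd))
    if dry_run:
        return 0
    # check=True raises CalledProcessError on a non-zero exit code.
    return subprocess.run(list(cmd), check=True).returncode
```

A build script would then call something like `run(["docker", "build", "-t", "example-nginx:local", "./nginx"], verbose=args.verbose)` for each image.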

Anything that can't be automated prints out an input line that gives instructions on what to do and just waits for you to input "yes/no"

  # call ./runbooks/get_crash_report.py
  ./run.sh get_crash_report --out=./crashes/
  > # Copying crashes from AWS to './crashes/'
  > # Manual Step: Fill out crashes spreadsheet: docs.google/example_sheet
  > Continue [y/n]? 
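
The manual-step prompt can be a very small helper. This is a sketch under assumptions: `manual_step` is a hypothetical name, and the `ask` parameter exists only so the prompt can be swapped out (for example in tests) instead of always calling `input`:

```python
from typing import Callable


def manual_step(instruction: str, ask: Callable[[str], str] = input) -> bool:
    """Print a manual instruction and wait until the operator answers y or n."""
    print(f"# Manual Step: {instruction}")
    while True:
        answer = ask("Continue [y/n]? ").strip().lower()
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            return False
        # Anything else: ask again rather than guessing.
```

A runbook script can then bail out early with `if not manual_step("Fill out crashes spreadsheet"): raise SystemExit(1)`.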
  
The other really nice thing with this setup is the run.sh script is able to build up --help commands that can print out what actions are available and what params they use cause it's just python argparse. Makes discovery of what to do or looking up params really quick.

At this point, the only culture you need to build is one where everyone's supposed to use the run.sh scripts and not do things manually. This forces people to fix the scripts when something changes.

YMMV, but I've found this has simplified a lot of processes for myself at least.
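
For anyone who wants to try this, here is a rough sketch of the dispatcher side, i.e. the Python that a run.sh could call. It is not the actual script: `discover`, `build_parser` and `main` are hypothetical names, files starting with an underscore are skipped by convention, and unknown flags are simply forwarded to the target script, which is assumed to parse its own params with argparse:

```python
import argparse
import subprocess
import sys
from pathlib import Path
from typing import Dict, List, Tuple

Command = Tuple[str, ...]


def discover(root: Path) -> Dict[Command, Path]:
    """Map command paths like ('stack', 'update_secret') to script files."""
    commands: Dict[Command, Path] = {}
    for script in sorted(root.rglob("*.py")):
        if script.name.startswith("_"):
            continue  # helper modules, not runbook commands
        commands[script.relative_to(root).with_suffix("").parts] = script
    return commands


def build_parser(commands: Dict[Command, Path]) -> argparse.ArgumentParser:
    """Build nested argparse subcommands mirroring the directory tree."""
    parser = argparse.ArgumentParser(prog="run.sh")
    # One subparsers group per directory level, created on first use.
    groups = {(): parser.add_subparsers(metavar="command", required=True)}
    for parts, script in commands.items():
        for depth in range(1, len(parts)):
            prefix = parts[:depth]
            if prefix not in groups:
                group = groups[prefix[:-1]].add_parser(prefix[-1])
                groups[prefix] = group.add_subparsers(metavar="command", required=True)
        # '%' must be escaped because argparse %-formats help strings.
        leaf = groups[parts[:-1]].add_parser(
            parts[-1], help=str(script).replace("%", "%%")
        )
        leaf.set_defaults(script=script)
    return parser


def main(argv: List[str], root: Path = Path("runbooks")) -> int:
    """Resolve the subcommand and hand the remaining flags to its script."""
    args, rest = build_parser(discover(root)).parse_known_args(argv)
    return subprocess.call([sys.executable, str(args.script), *rest])
```

With this, run.sh only needs to call `main(sys.argv[1:])` from a small entry script, and `./run.sh --help` or `./run.sh stack --help` lists whatever scripts exist on disk. A directory and a script with the same name at the same level would conflict; the sketch does not handle that.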


I assume each service has its own health check that checks the service is accessible from an internal location, thus most are green. However, when Service A requires Service B to do work, but Service B is down, a simple access check on Service A clearly doesn't give a good representation of uptime.

So what should a good health check actually report these days? Is it just about its own status, or should it include a breakdown of the status of external dependencies as part of its folded up status?
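
One possible shape for the "folded up" option, as a Python sketch. The `Status` levels and the rule that a healthy service with an unhealthy dependency reports "degraded" are assumptions made for this example, not an established standard:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Status(Enum):
    UP = "up"
    DEGRADED = "degraded"
    DOWN = "down"


@dataclass
class Health:
    name: str
    own: Status  # result of the service's own access check
    dependencies: Dict[str, "Health"] = field(default_factory=dict)

    def overall(self) -> Status:
        """Fold dependency health into a single status."""
        if self.own is Status.DOWN:
            return Status.DOWN
        if any(dep.overall() is not Status.UP for dep in self.dependencies.values()):
            return Status.DEGRADED
        return self.own

    def report(self) -> dict:
        """Keep both views: the folded status and the per-dependency breakdown."""
        return {
            "service": self.name,
            "status": self.overall().value,
            "self": self.own.value,
            "dependencies": {n: d.report() for n, d in self.dependencies.items()},
        }


service_b = Health("service-b", Status.DOWN)
service_a = Health("service-a", Status.UP, {"service-b": service_b})
```

Here `service_a.report()` still shows `"self": "up"` for the simple access check, but the top-level status reflects that Service B is down, so a dashboard can show either view.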


I don't see much of an issue with any company having lots of entry points into different industries. I see more of the problem being tight control over vertically integrated aspects of the supply chain.

I'd prefer to see rules limiting companies from creating products in 2 consecutive/complementary components of a supply chain unless one of the products is completely open for 'swapping' out with something else. This allows companies to still benefit from vertical integration if they don't commercialize one side of things. But as you said, it's hard to draw the lines, cause you could define a supply chain in a ton of different ways.

