Show HN: We made reverse image search for NFTs to identify fraud and forgery (fnftf.io)
1 point by kkielhofner on May 18, 2022 | 9 comments
Hey HN!

We just launched[0] our NFT search engine at https://fnftf.io.

NFT frauds, copies, forgeries, etc. are rampant. As one example, check out a search[1] for a random Bored Ape that shows about a dozen visually indistinguishable copies of the Ape. You can also expand the results to see the various, ummmm, “remixes” of the image submitted for search (and play with various filters, etc.).

With that out of the way here’s how we built FNFTF:

- Blockchain... We run our own node infrastructure because the various node providers get really expensive really quickly when you need to index chains like we do. We have a custom Node.js backend written in TypeScript that handles this indexing (both historical and realtime) over HTTP and WebSocket connections to our various nodes. I could spend a lot of time talking about how challenging this first step is…

- We fetch and catalog all of the on-chain data, the metadata, and the actual content (wherever it may be). We do this in realtime as new NFTs are minted, sold, whatever. This has plenty of challenges too.

- We add the media content (currently all image and video formats) to our database via our perceptual hashing implementation.

- Search and comparison is the tricky one... Every so often we build an approximate nearest neighbors index for all of the content in the perceptual hashing database. This is then loaded in memory.

- The actual search comes in multiple passes. We first take submitted content and generate an abbreviated perceptual hash for it. We search the ANN index to get a first pass of results using various standard distance approaches. We then filter that first pass through higher resolution perceptual hashes to increasingly filter the results and generate distance scores for percentage of content match scoring.

- The backend for the hash and search steps is Python, powered by FastAPI.

- The API frontend is a Cloudflare worker in Bundled mode. We currently use about 6ms of CPU time so we have plenty of room there.

- The fnftf.io page is Next.js with a lot of React components generated statically and served via Cloudflare Pages.

- Speaking of Cloudflare, we use Workers to fetch the image results from our backend storage to resize, re-compress, and add our watermark. This is crucial because we want to provide result images but we definitely don’t want to further enable scammers.

- We cache search results in a CF Workers KV store for speedier follow-up searches and to enable search sharing on social media, etc. In terms of caching it’s not terribly effective because it relies on matching a hash of the search but it’s good for the sharing aspect.

- Our browser extension[2] is absolute bare bones and enables two click searches directly from about a dozen NFT marketplaces. All it does is get the image URL and launch a new tab to fnftf.io with the image URL as a parameter. Then we fetch that URL and do our thing.
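
To make the multi-pass search above a bit more concrete, here is a minimal Python sketch of the coarse-then-fine idea. It is not our production code: the average hash, the 4x4 and 8x8 hash sizes, the `max_coarse_distance` cutoff, and the linear scan standing in for the ANN index are all assumptions chosen to keep the example small and dependency-free.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale pixels 0-255, all rows the same length


def average_hash(img: Image, size: int) -> int:
    """Average hash: shrink to size x size by block averaging, then emit one
    bit per cell (1 if the cell is brighter than the mean of all cells).
    Assumes the image's height and width are multiples of `size`."""
    block_h, block_w = len(img) // size, len(img[0]) // size
    cells = []
    for r in range(size):
        for c in range(size):
            block = [img[y][x]
                     for y in range(r * block_h, (r + 1) * block_h)
                     for x in range(c * block_w, (c + 1) * block_w)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def build_index(catalog: Dict[str, Image], coarse: int = 4,
                fine: int = 8) -> Dict[str, Tuple[int, int]]:
    """Precompute a short and a long hash for every catalogued image."""
    return {name: (average_hash(img, coarse), average_hash(img, fine))
            for name, img in catalog.items()}


def search(query: Image, index: Dict[str, Tuple[int, int]], coarse: int = 4,
           fine: int = 8, max_coarse_distance: int = 2) -> List[Tuple[str, float]]:
    q_coarse, q_fine = average_hash(query, coarse), average_hash(query, fine)
    # Pass 1: cheap candidate selection on the short hash. A linear scan
    # stands in here for the in-memory ANN index (assumption).
    candidates = [name for name, (c, _) in index.items()
                  if hamming(q_coarse, c) <= max_coarse_distance]
    # Pass 2: re-score the survivors on the longer hash as a percentage match.
    fine_bits = fine * fine
    scored = [(name, 100.0 * (1 - hamming(q_fine, index[name][1]) / fine_bits))
              for name in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


original = [[x * 16 for x in range(16)] for _ in range(16)]   # horizontal gradient
brighter_copy = [[p + 3 for p in row] for row in original]    # slightly brightened copy
unrelated = [[y * 16 for _ in range(16)] for y in range(16)]  # vertical gradient

index = build_index({"original": original, "unrelated": unrelated})
results = search(brighter_copy, index)
print(results)  # [('original', 100.0)]
```

The short hash keeps the first pass cheap across the whole catalog, and the longer hash is only computed against the handful of candidates that survive it.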

All in we have about 40TB of data, growing by the day as we index new content and add blockchains.

I’m the sole founder at Tovera and only full time employee. I’d love to hear what the HN community thinks about FNFTF.io!

[0] https://www.producthunt.com/posts/fight-nft-fraud

[1] https://fnftf.io/?results=00722cd4e7124f8aab052e31b14e301e37...

[2] https://chrome.google.com/webstore/detail/tovera/gcghgjemlna...



> We never save the images you submit because we know stealing is wrong.

Is this satire or no? I seriously can't tell, this reads like something from a 90's hacker zine astroturf article directed at Unix vendors...


Hah, no not satire. Believe me - I felt like a 6yr old writing that sentence but with all of the shady and criminal activity in the space I wanted to be as straightforward and emphatic as possible that we aren't doing all of this as some sort of ruse to steal content.


Your tool detects 'fraud and forgery' of something that no-one actually owns? How exactly does that work?


The creator of a work absolutely owns the copyright to it. For them, FNFTF can show copies or unauthorized derivative works.

For any other use the situation is more complicated. Some NFTs confer ownership and/or use rights.

There's additional value with FNFTF in that users who are thinking of purchasing an NFT can have confidence they're not buying a copy that may not have the value, confer the rights, or provide some additional benefit they think it might.

Like any other counterfeit good, as we speak someone is buying an NFT as (if nothing else) an asset class and paying what they think it is worth. If someone is spending tens or hundreds of thousands of dollars on an Ape (for example) they may find out down the line that it's a copy and no one considers it to be worth anything. This isn't any different from forged works that have been sold for all of human history.

Then there are marketplaces and other uses. Marketplaces can use the underlying API powering FNFTF to prevent copied content from ending up on their platform, removal of fraudulent/inauthentic content, etc.


So if I have this right, NFTs are a way to hand over a wad of money to get a receipt back that demonstrates proof of ownership of something that isn't in your possession and which can be deleted/modified/copied at any time, and your tool is a way for buyers to have increased confidence that the thing which can be deleted/modified/copied at any time is in fact the original? I think I'm too old for this nonsense.


You're really close, actually! NFTs are a way for you to get a ledger that attests your ownership of the item by some arbitrary group. This arbitrary group is not respected nor recognized by any law to be a standard (nor is it flawless), and ultimately gives you no authoritative power over the object except for limited scenarios where said arbitrary group faithfully administrates your marketplace. Oh, but there's the fact that nobody will operate faithfully in the face of turmoil, and coins like $LUNA will backstab their top 1% of users to stop the coin from collapsing.

None of these are safe or sane. I think all of us are too old for this nonsense.


How do you handle situations where the NFT is the image violating copyrights of another?


For NFTs that are copies of other NFTs we go with the general idea that the oldest (first) similar result is likely the original. This is an area where the blockchain as a fundamental time stamped ledger is very useful.
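
As a rough illustration of the "oldest similar result wins" heuristic, here is a small Python sketch. The `Match` fields, the 90% `min_match_pct` cutoff, and the tie-break on `token_id` are hypothetical; they are stand-ins for whatever the real search results carry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Match:
    token_id: str         # hypothetical identifier for the NFT
    mint_timestamp: int   # Unix time of the block that minted it
    match_pct: float      # similarity to the submitted image, 0-100


def likely_original(matches: List[Match],
                    min_match_pct: float = 90.0) -> Optional[Match]:
    """Among sufficiently similar results, treat the earliest mint as the
    probable original. Returns None if nothing clears the cutoff."""
    similar = [m for m in matches if m.match_pct >= min_match_pct]
    if not similar:
        return None
    # Sort key: earliest mint first, token_id as a deterministic tie-break.
    return min(similar, key=lambda m: (m.mint_timestamp, m.token_id))


matches = [
    Match("copy-2", mint_timestamp=1_650_000_000, match_pct=99.0),
    Match("first-mint", mint_timestamp=1_620_000_000, match_pct=97.5),
    Match("loosely-similar", mint_timestamp=1_600_000_000, match_pct=40.0),
]
print(likely_original(matches).token_id)  # first-mint
```

Timestamps are used here rather than block numbers because block numbers are not comparable across different chains.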

In cases where an NFT is stolen from existing web content (happens A LOT) our version of the API (with auth) integrates search results from the TinEye API. These searches are relatively expensive so with the initial release we put them behind an API key. That may change.

In terms of copyright violations... We've talked to a lot of lawyers in the space and it breaks down into three broad categories:

1) Straight up infringement. This is anything with a (roughly) 94% or greater pixel match. In terms of copyright this is an open and shut case for infringement.

2) Some kind of derivative work... Whether it's a well-intentioned remix of another work or a deliberate attempt to avoid detection or cause confusion, we haven't yet received hard legal guidance on where this line falls algorithmically.

3) A new work that's visually similar. Again, not sure what the computer vision approach is here.
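
To show what the first category could look like in code, here is a hedged Python sketch of a pixel-match check with a 94% cutoff. The per-pixel `tolerance`, the grayscale representation, and the `triage` labels are assumptions for illustration only; everything below the cutoff is simply routed to a human, since the line between categories 2 and 3 is not defined algorithmically.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels 0-255, same dimensions for both images


def pixel_match_pct(a: Image, b: Image, tolerance: int = 8) -> float:
    """Percentage of pixel positions whose values differ by at most `tolerance`.
    Assumes both images were already scaled to the same dimensions."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have identical dimensions")
    total = sum(len(row) for row in a)
    same = sum(1
               for ra, rb in zip(a, b)
               for pa, pb in zip(ra, rb)
               if abs(pa - pb) <= tolerance)
    return 100.0 * same / total


def triage(match_pct: float, copy_cutoff: float = 94.0) -> str:
    """Only the near-exact case is decided automatically (hypothetical labels)."""
    return "likely copy" if match_pct >= copy_cutoff else "needs human review"


base = [[0] * 10 for _ in range(10)]
near_copy = [row[:] for row in base]
for x in range(5):                 # alter 5 of 100 pixels
    near_copy[0][x] = 255

pct = pixel_match_pct(base, near_copy)
print(pct, triage(pct))  # 95.0 likely copy
```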

In any case it comes down to "What can you do about it?". First steps are making sure people don't buy it and creators are aware of it. Next steps (working on this!) are getting this content off NFT marketplaces. That can come either through cooperation with them or by automatically serving DMCA takedowns to content hosts (problematic with IPFS, etc).


OpenSea seems to be doing this internally already. Is this tool necessary if most exchanges will flag copycats for you?



