I think what you want for this is more likely a REST API transform. You'll POST (rather than PUT) your document to /v1/documents?directory=/content/&extension=json&transform=verify -- assuming that you have created a transform called "verify". MarkLogic will then take care of generating a unique URI for you.
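To make the request shape concrete, here is a minimal client-side sketch in Python using only the standard library. It just builds the POST described above and does not send it. The host and port (`http://localhost:8040`), the function name `build_insert_request`, and the sample body are assumptions for illustration; substitute your own REST app server and a transform name you have actually installed.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_insert_request(base_url, body, directory="/content/",
                         extension="json", transform="verify"):
    """Build (but do not send) a POST to /v1/documents.

    base_url is a placeholder for your REST app server, and "verify" is
    assumed to be the name of a transform that is already installed.
    """
    # Keep "/" unescaped so the directory reads as /content/ in the URL.
    query = urlencode(
        {"directory": directory, "extension": extension, "transform": transform},
        safe="/",
    )
    return Request(
        url=f"{base_url}/v1/documents?{query}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical host, port, and document body.
    req = build_insert_request("http://localhost:8040", b'{"title": "example"}')
    print(req.get_method(), req.full_url)
```

Actually sending it (for example with `urllib.request.urlopen`) will most likely require credentials; how authentication is configured depends on your app server, so that part is left out here.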
The transform can do whatever error checking you need and throw errors with appropriate HTTP error codes as needed. The transform runs before the document is inserted into the database, so throwing an error will prevent the insert.
Looks like you've already found this, but for others' benefit, Roxy provides scaffolding for REST extensions and REST transformations.
Do note that even though you're sending JSON, the transform will receive the XML representation of that JSON, which is how it will be stored internally.
EDIT
Responding to the latter part of your question...
Q: Is there a CURL command to test this transformation? A: Once you've installed the transform ("ml local deploy modules" with Roxy), you'll use a CURL command that matches how the transformation will really be used: POST to /v1/documents?...&transform=check. I like to use the Postman Chrome extension for that kind of testing; I believe there are similar tools for other browsers.
Q: And what to do with the 'then' part here, do I refer to another script for the handling of the file or do I include insert-document code here so this transformation is in place of an extension? A: Note the first bullet under Guidelines for Writing Transforms: "transforms typically should not have side-effects". There are exceptions of course, but remember that by posting to /v1/documents, the normal course is for the document to be inserted into the database. With your transform in place, the content returned from your transform is what will be inserted. Thus if the "then" part signifies that the document is good, just do "then $input". See the example of an ingestion transform. You don't need to do an insert-document.
Suppose, however, that you wanted to do some more sophisticated checking. You could pass the input node to a function defined in a library module, which would take $input as a parameter. Your function might then throw exceptions if there were errors, or even modify the input document if correctable problems are found. Your transform would then return the result of that function call. The nice thing about putting that functionality into a separate library module (which you can import normally) is that it is better isolated for testing.