Give API access to FileTypeParser.detectors
This gives the user more control to determine the sequence of detectors.

Resolves: #628
Borewit committed Dec 15, 2024
1 parent e99257d commit f08c857
Showing 3 changed files with 25 additions and 24 deletions.
5 changes: 4 additions & 1 deletion core.d.ts
@@ -180,7 +180,10 @@ This method can be handy to put in a stream pipeline, but it comes with a price.
export function fileTypeStream(webStream: AnyWebReadableStream<Uint8Array>, options?: StreamOptions): Promise<AnyWebReadableByteStreamWithFileType>;

export declare class FileTypeParser {
detectors: Iterable<Detector>;
/**
File-type detectors
*/
readonly detectors: Detector[];

constructor(options?: {customDetectors?: Iterable<Detector>; signal?: AbortSignal});

15 changes: 6 additions & 9 deletions core.js
@@ -57,19 +57,18 @@ export async function fileTypeStream(webStream, options) {

export class FileTypeParser {
constructor(options) {
this.detectors = options?.customDetectors;
this.detectors = options?.customDetectors ?? [];
this.tokenizerOptions = {
abortSignal: options?.signal,
};
this.fromTokenizer = this.fromTokenizer.bind(this);
this.fromBuffer = this.fromBuffer.bind(this);
this.parse = this.parse.bind(this);
this.detectors.push(this.parse); // Assign core file-type detector
}

async fromTokenizer(tokenizer) {
const initialPosition = tokenizer.position;

for (const detector of this.detectors || []) {
// Iterate through all file-type detectors
for (const detector of this.detectors) {
const fileType = await detector(tokenizer);
if (fileType) {
return fileType;
@@ -79,8 +78,6 @@ export class FileTypeParser {
return undefined; // Cannot proceed scanning if the tokenizer is at an arbitrary position
}
}

return this.parse(tokenizer);
}

async fromBuffer(input) {
@@ -163,7 +160,7 @@ export class FileTypeParser {
return this.check(stringToBytes(header), options);
}

async parse(tokenizer) {
parse = async tokenizer => {
this.buffer = new Uint8Array(reasonableDetectionSizeInBytes);

// Keep reading until EOF if the file size is unknown.
@@ -1669,7 +1666,7 @@ export class FileTypeParser {
};
}
}
}
};

async readTiffTag(bigEndian) {
const tagId = await this.tokenizer.readToken(bigEndian ? Token.UINT16_BE : Token.UINT16_LE);
29 changes: 15 additions & 14 deletions readme.md
@@ -342,10 +342,12 @@ Returns a `Set<string>` of supported MIME types.
A custom detector is a function that allows specifying custom detection mechanisms.
An iterable of detectors can be provided via the `fileTypeOptions` argument for the `FileTypeParser` constructor.
An array of detectors can be provided via the `fileTypeOptions` argument for the `FileTypeParser` constructor.
In Node.js, you should use `NodeFileTypeParser`, which extends `FileTypeParser` and provides access to Node.js specific functions.
The detectors are called before the default detections in the provided order.
Detectors can be added via the constructor options, or by adding them directly to the `detectors` array of a `FileTypeParser` instance.
Detectors provided via the constructor options are called before the default detectors.
Custom detectors can be used to add new `FileTypeResults` or to modify return behaviour of existing `FileTypeResult` detections.
@@ -361,23 +363,22 @@ Example detector array which can be extended and provided to each public method
```js
import {FileTypeParser} from 'file-type'; // or `NodeFileTypeParser` in Node.js

const customDetectors = [
async tokenizer => {
const unicornHeader = [85, 78, 73, 67, 79, 82, 78]; // 'UNICORN' as decimal string
const customDetector = async tokenizer => {
const unicornHeader = [85, 78, 73, 67, 79, 82, 78]; // 'UNICORN' as decimal string

const buffer = new Uint8Array(7);
await tokenizer.peekBuffer(buffer, {length: unicornHeader.length, mayBeLess: true});
const buffer = new Uint8Array(7);
await tokenizer.peekBuffer(buffer, {length: unicornHeader.length, mayBeLess: true});

if (unicornHeader.every((value, index) => value === buffer[index])) {
return {ext: 'unicorn', mime: 'application/unicorn'};
}
if (unicornHeader.every((value, index) => value === buffer[index])) {
return {ext: 'unicorn', mime: 'application/unicorn'};
}

return undefined;
},
];
return undefined;
};

const buffer = new Uint8Array(new TextEncoder().encode('UNICORN'));
const parser = new FileTypeParser({customDetectors}); // `NodeFileTypeParser({customDetectors})` in Node.js
const parser = new FileTypeParser(); // `NodeFileTypeParser()` in Node.js
parser.detectors.unshift(customDetector); // Make customDetector the first detector
const fileType = await parser.fromBuffer(buffer);
console.log(fileType);
```
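
The ordering this commit exposes can be illustrated without the library. The following is a minimal sketch, not the actual implementation: `SketchParser`, its `detect` method, and the `unicorn` and `fallback` detectors are hypothetical stand-ins that work on strings instead of a tokenizer. It models only what the diff shows: the core detector is pushed after the custom ones, and the first detector that returns a result wins. One difference is deliberate. The constructor in this diff calls `push` on whatever `customDetectors` is, so a passed-in array gets the core detector appended to it, and a non-array iterable such as a `Set` (still permitted by the `Iterable<Detector>` constructor type in `core.d.ts`) has no `push` method at all. The sketch copies the iterable to sidestep both, which is an assumption of the sketch and not what the commit does.

```js
// Hypothetical stand-in for FileTypeParser: it models only the detector chain
// from this commit and works on strings instead of a tokenizer.
class SketchParser {
	constructor(options) {
		// Copy the iterable (an assumption of this sketch; the commit stores the option as-is)
		this.detectors = [...(options?.customDetectors ?? [])];
		this.detectors.push(this.parse); // Core detector goes after the custom ones
	}

	// Arrow-function field, so it can be called through the array without `bind`
	parse = async input => (input.startsWith('%PDF') ? {ext: 'pdf', mime: 'application/pdf'} : undefined);

	async detect(input) {
		// The first detector that returns a result wins
		for (const detector of this.detectors) {
			const fileType = await detector(input);
			if (fileType) {
				return fileType;
			}
		}

		return undefined;
	}
}

const unicorn = async input => (input.startsWith('UNICORN') ? {ext: 'unicorn', mime: 'application/unicorn'} : undefined);
const fallback = async () => ({ext: 'bin', mime: 'application/octet-stream'});

const parser = new SketchParser();
parser.detectors.unshift(unicorn); // Runs before the core detector
parser.detectors.push(fallback); // Runs only when every earlier detector returned undefined

const results = Promise.all(['UNICORN data', '%PDF-1.7', 'plain text'].map(input => parser.detect(input)));
results.then(fileTypes => console.log(fileTypes.map(fileType => fileType.ext))); // [ 'unicorn', 'pdf', 'bin' ]
```

Because `detectors` is a plain array, the usual array methods decide the order: `unshift` places a detector ahead of the core one, `push` turns it into a fallback, and `splice` could place it anywhere in between.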