ArtifactBuilder

API reference for ArtifactBuilder, Artifact, and ArtifactMetadata in furiosa_llm.

ArtifactBuilder

class ArtifactBuilder

Builds model artifacts for use with Furiosa LLM.

param self
param model_id_or_path: str | Path

The HuggingFace model id or a local path. This corresponds to pretrained_model_name_or_path in HuggingFace Transformers.

param name: str = ''

The name of the artifact to build. If not provided, defaults to model_id_or_path.

param model_config: ModelConfig | None = None

Configuration for HuggingFace model settings (trust_remote_code, etc.).

param parallel_config: ParallelConfig | None = None

Configuration for parallelization (tensor and pipeline parallelism). Defaults to tensor_parallel_size=8, pipeline_parallel_size=1.

param bucket_config: BucketConfig | None = None

Configuration for attention buckets and sequence lengths. When all bucket fields are empty, a matching bucket preset is automatically applied based on the model metadata.

param compiler_config: CompilerConfig | None = None

Configuration for compiler and model rewriting options.

param artifact_config: ArtifactConfig | None = None

Configuration for artifact export options.
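The constructor parameters above can be sketched as follows. This is a minimal illustration, not the library's own example: the import path `furiosa_llm.artifact.builder` is an assumption that may differ in your installed version, the model id and artifact name are placeholders, and `optional_builder_kwargs` and `make_builder` are hypothetical helpers introduced here.

```python
from pathlib import Path
from typing import Any, Dict, Union

# Optional config parameters documented for the ArtifactBuilder constructor.
CONFIG_PARAMS = {
    "model_config",
    "parallel_config",
    "bucket_config",
    "compiler_config",
    "artifact_config",
}


def optional_builder_kwargs(name: str = "", **configs: Any) -> Dict[str, Any]:
    """Collect optional constructor arguments, omitting anything left unset."""
    unknown = set(configs) - CONFIG_PARAMS
    if unknown:
        raise TypeError(f"unexpected config arguments: {sorted(unknown)}")
    kwargs: Dict[str, Any] = {}
    if name:
        kwargs["name"] = name
    # Omitting a config keeps its documented default of None.
    kwargs.update({k: v for k, v in configs.items() if v is not None})
    return kwargs


def make_builder(model_id_or_path: Union[str, Path], **kwargs: Any) -> Any:
    """Create an ArtifactBuilder (requires furiosa_llm to be installed)."""
    # Assumption: the class is importable from this module path.
    from furiosa_llm.artifact.builder import ArtifactBuilder

    return ArtifactBuilder(model_id_or_path, **kwargs)


# Placeholder artifact name, for illustration only.
kwargs = optional_builder_kwargs(name="llama-3.1-8b")
print(kwargs)
```

With furiosa_llm installed, `make_builder("meta-llama/Llama-3.1-8B-Instruct", **kwargs)` would then create the builder. Leaving every config argument out relies on the defaults described above, including the automatic bucket preset.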

Methods

method build(self, save_dir, *, num_pipeline_builder_workers=1, num_compile_workers=1, num_cpu_per_pipeline_build_worker=1, num_cpu_per_compile_worker=1, cache_dir=CACHE_DIR, param_file_path=None, param_saved_format='safetensors', param_file_max_shard_size='5GB', _cleanup=True, _raise_error_if_compile=False, **kwargs)

Build the artifacts for the given model configuration.

param self
param save_dir: str | os.PathLike

The directory in which to save the artifacts. With saved artifacts, you can create an LLM without quantizing or compiling the model again.

param num_pipeline_builder_workers: int = 1

The number of workers used for building pipelines (excluding compilation). The default is 1 (no parallelism). Values larger than 1 reduce pipeline build time, especially for large models, but require much more memory.

param num_compile_workers: int = 1

The number of workers used for compilation. The default is 1 (no parallelism).

param num_cpu_per_pipeline_build_worker: int = 1
param num_cpu_per_compile_worker: int = 1
param cache_dir: os.PathLike | None = CACHE_DIR

The cache directory for all generated files for this LLM instance. When its value is None, caching is disabled. The default is "$HOME/.cache/furiosa/llm".

param param_file_path: os.PathLike | None = None

The path to the parameter file to use for pipeline generation. If not specified, the parameters are saved to a temporary file, which is deleted when the LLM is destroyed.

param param_saved_format: Literal['safetensors', 'pt'] = 'safetensors'

The format of the parameter file. Currently, "safetensors" is the only supported value, and it is the default.

param param_file_max_shard_size: str | int | None = '5GB'

The maximum size of a single parameter file. The parameters are split into multiple files so that each file is smaller than this size. The default is "5GB".

param _cleanup: bool = True
param _raise_error_if_compile: bool = False
param kwargs = {}

Returns

None
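To make the effect of param_file_max_shard_size concrete, the sketch below estimates the minimum number of parameter files for a given limit and then passes documented keyword arguments to build(). It rests on assumptions: size strings are treated as decimal units (so "5GB" is 5 × 10^9 bytes), None is taken to mean "do not split", and `shard_size_bytes`, `min_shard_count`, and `run_build` are hypothetical helpers, not part of furiosa_llm. The real shard count may be higher, since files can only be split at tensor boundaries.

```python
import re
from typing import Any, Optional, Union

# Assumption: decimal units, as in HuggingFace-style size strings.
_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def shard_size_bytes(size: Union[str, int, None]) -> Optional[int]:
    """Convert a size such as "5GB" to bytes; ints and None pass through."""
    if size is None or isinstance(size, int):
        return size
    match = re.fullmatch(r"(\d+)\s*([KMGT]B)", size.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size string: {size!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def min_shard_count(total_bytes: int, max_shard_size: Union[str, int, None]) -> int:
    """Lower bound on the number of parameter files for a given size limit."""
    limit = shard_size_bytes(max_shard_size)
    if limit is None:  # assumed to mean "do not split"
        return 1
    return max(1, -(-total_bytes // limit))  # ceiling division


def run_build(builder: Any, save_dir: str, compile_workers: int = 1) -> None:
    """Call build() using only keyword arguments documented above."""
    builder.build(
        save_dir,
        num_compile_workers=compile_workers,
        param_saved_format="safetensors",
        param_file_max_shard_size="5GB",
    )


# Roughly 16 GB of weights (8e9 parameters at 2 bytes each) with a 5 GB limit.
print(min_shard_count(16 * 10**9, "5GB"))
```

`run_build` accepts any object with a compatible build() method, so it can be exercised with a stand-in before running a real, long build.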

Artifact

class Artifact

A built model artifact: the compiled model together with its metadata, generator configuration, and schema version.

Attributes

attribute metadata: ArtifactMetadata
attribute model: ModelArtifact
attribute generator_config: GeneratorConfig
attribute version: SchemaVersion

Methods

method from_previous_version(previous_version_artifact)

Construct an Artifact from a previous-version artifact. Conversion from previous versions is not currently supported.

method export(path)

Serialize the artifact to the given path as indented JSON.

ArtifactMetadata

class ArtifactMetadata

Pydantic model describing the metadata of a built model artifact.

Attributes

attribute artifact_id: str
attribute name: str
attribute timestamp: int
attribute furiosa_llm_version: str
attribute furiosa_compiler_version: str
attribute includes_composable_ir: bool
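Because export() writes the artifact as indented JSON, the metadata fields listed above can be inspected with the standard json module. The sketch below assumes that the exported file has a top-level "metadata" key matching the Artifact attribute name; that layout, the file name artifact.json, the helper `read_metadata`, and all sample values are hypothetical and used only for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict

# Field names taken from the ArtifactMetadata attributes listed above.
METADATA_FIELDS = (
    "artifact_id",
    "name",
    "timestamp",
    "furiosa_llm_version",
    "furiosa_compiler_version",
    "includes_composable_ir",
)


def read_metadata(artifact_json: Path) -> Dict[str, Any]:
    """Load the metadata section of an exported artifact JSON file."""
    with open(artifact_json, encoding="utf-8") as f:
        document = json.load(f)
    metadata = document["metadata"]  # assumed top-level key
    missing = [key for key in METADATA_FIELDS if key not in metadata]
    if missing:
        raise KeyError(f"missing metadata fields: {missing}")
    return {key: metadata[key] for key in METADATA_FIELDS}


# Placeholder document standing in for a real export; all values are made up.
sample = {
    "metadata": {
        "artifact_id": "example-id",
        "name": "example-artifact",
        "timestamp": 0,
        "furiosa_llm_version": "0.0.0",
        "furiosa_compiler_version": "0.0.0",
        "includes_composable_ir": False,
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "artifact.json"
    path.write_text(json.dumps(sample, indent=4), encoding="utf-8")
    metadata = read_metadata(path)

print(metadata["name"])
```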
