ArtifactBuilder
API reference for ArtifactBuilder, Artifact, and ArtifactMetadata in furiosa_llm.
ArtifactBuilder
class ArtifactBuilder
The artifact builder to use in the Furiosa LLM.
param self
param model_id_or_path: str | Path
The HuggingFace model id or a local path. This corresponds to pretrained_model_name_or_path in HuggingFace Transformers.
param name: str = ''
The name of the artifact to build. If not provided, defaults to model_id_or_path.
param model_config: ModelConfig | None = None
Configuration for HuggingFace model settings (trust_remote_code, etc.).
param parallel_config: ParallelConfig | None = None
Configuration for parallelization (tensor and pipeline parallelism). Defaults to tensor_parallel_size=8, pipeline_parallel_size=1.
param bucket_config: BucketConfig | None = None
Configuration for attention buckets and sequence lengths. When all bucket fields are empty, a matching bucket preset is automatically applied based on the model metadata.
param compiler_config: CompilerConfig | None = None
Configuration for compiler and model rewriting options.
param artifact_config: ArtifactConfig | None = None
Configuration for artifact export options.
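A minimal sketch of constructing the builder with the constructor arguments listed above. The import path in the trailing comment and the model id placeholder are assumptions, not taken from this reference, and make_builder is a hypothetical helper. It takes the builder class as an argument so the snippet can be exercised without the library installed.

```python
from pathlib import Path
from typing import Any, Union

# Config keyword names exactly as documented for the constructor.
CONFIG_ARGS = {
    "model_config",
    "parallel_config",
    "bucket_config",
    "compiler_config",
    "artifact_config",
}


def make_builder(
    builder_cls: Any,
    model_id_or_path: Union[str, Path],
    name: str = "",
    **configs: Any,
) -> Any:
    """Instantiate builder_cls (expected to be ArtifactBuilder).

    Hypothetical helper: it only forwards the documented arguments and
    rejects misspelled config names before the constructor is called.
    Configs left unset keep the library's own defaults.
    """
    unknown = set(configs) - CONFIG_ARGS
    if unknown:
        raise TypeError(f"unexpected config arguments: {sorted(unknown)}")
    return builder_cls(model_id_or_path, name=name, **configs)


# Possible usage, assuming this import path is correct for your install:
#   from furiosa_llm.artifact import ArtifactBuilder
#   builder = make_builder(ArtifactBuilder, "<model-id-or-path>", name="my-artifact")
```

Leaving every config argument unset relies on the defaults described above, for example tensor_parallel_size=8 and pipeline_parallel_size=1 for parallel_config.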
Methods
method build(self, save_dir, *, num_pipeline_builder_workers=1, num_compile_workers=1, num_cpu_per_pipeline_build_worker=1, num_cpu_per_compile_worker=1, cache_dir=CACHE_DIR, param_file_path=None, param_saved_format='safetensors', param_file_max_shard_size='5GB', _cleanup=True, _raise_error_if_compile=False, **kwargs)
Build the artifacts for the given model configurations.
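Because of the bare `*` in the signature above, everything after save_dir has to be passed by keyword. A minimal sketch of a call that sets only the two worker counts, assuming `builder` is an already constructed ArtifactBuilder; run_build is a hypothetical wrapper and not part of the library.

```python
import os
from typing import Any, Union


def run_build(
    builder: Any,
    save_dir: Union[str, "os.PathLike[str]"],
    *,
    pipeline_workers: int = 1,
    compile_workers: int = 1,
) -> None:
    """Call builder.build with explicit worker counts (hypothetical wrapper).

    Only save_dir is positional; the worker options are keyword-only,
    matching the documented signature. All other options keep their defaults.
    """
    if pipeline_workers < 1 or compile_workers < 1:
        raise ValueError("worker counts must be at least 1")
    builder.build(
        save_dir,
        num_pipeline_builder_workers=pipeline_workers,
        num_compile_workers=compile_workers,
    )


# Possible usage: run_build(builder, "./my-artifact", compile_workers=4)
```

The wrapper leaves the pipeline worker count at 1 by default, since the reference notes that larger values need much more memory.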
param self
param save_dir: str | os.PathLike
The path to save the artifacts. With saved artifacts, you can create an LLM without quantizing or compiling the model again.
param num_pipeline_builder_workers: int = 1
The number of workers used for building pipelines (excluding compilation). The default is 1 (no parallelism). Setting this value larger than 1 reduces pipeline building time, especially for large models, but requires much more memory.
param num_compile_workers: int = 1
The number of workers used for compilation. The default is 1 (no parallelism).
param num_cpu_per_pipeline_build_worker: int = 1
param num_cpu_per_compile_worker: int = 1
param cache_dir: os.PathLike | None = CACHE_DIR
The cache directory for all generated files for this LLM instance. When its value is None, caching is disabled. The default is "$HOME/.cache/furiosa/llm".
param param_file_path: os.PathLike | None = None
The path to the parameter file to use for pipeline generation. If not specified, the parameters are saved in a temporary file, which is deleted when the LLM is destroyed.
param param_saved_format: Literal['safetensors', 'pt'] = 'safetensors'
The format of the parameter file. Currently the only supported value is "safetensors", which is also the default.
param param_file_max_shard_size: str | int | None = '5GB'
The maximum size of a single parameter file. The parameter file is split into smaller files so that each is smaller than this size. The default is "5GB".
param _cleanup: bool = True
param _raise_error_if_compile: bool = False
param kwargs = {}
Returns
None
Artifact
class Artifact
A built model artifact: the compiled model together with its metadata, generator configuration, and schema version.
Attributes
attribute metadata: ArtifactMetadata
attribute model: ModelArtifact
attribute generator_config: GeneratorConfig
attribute version: SchemaVersion
Methods
method from_previous_version(previous_version_artifact)
Construct an Artifact from a previous-version artifact. Conversion from previous versions is not currently supported.
method export(path)
Serialize the artifact to the given path as indented JSON.
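To illustrate what "indented JSON" means for export, here is a stand-in sketch using only the standard library. It does not call the real Artifact class. The four top-level keys mirror the documented attributes, but the nested values, the indent width, and both function names are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union


def export_indented(data: Dict[str, Any], path: Union[str, Path], indent: int = 2) -> None:
    """Write data to path as indented JSON (the indent width is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=indent)


def load_exported(path: Union[str, Path]) -> Dict[str, Any]:
    """Read an indented JSON file back into a plain dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Placeholder payload: top-level keys follow the documented Artifact
# attributes; the nested contents are made up for this sketch.
sample = {
    "metadata": {"name": "demo"},
    "model": {},
    "generator_config": {},
    "version": {},
}
```

Since the output is ordinary JSON, an exported artifact should be inspectable with any JSON tooling, without loading the compiled model.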
ArtifactMetadata
class ArtifactMetadata
Pydantic model describing the metadata of a built model artifact.
Attributes
attribute artifact_id: str
attribute name: str
attribute timestamp: int
attribute furiosa_llm_version: str
attribute furiosa_compiler_version: str
attribute includes_composable_ir: bool
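The field list above can be mirrored with a standard-library dataclass, for example to check a metadata dictionary read from an exported artifact. This is a sketch only: MetadataView is a hypothetical stand-in, not the library's Pydantic model, and the strict type check is a design choice of this example.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass(frozen=True)
class MetadataView:
    """Stdlib stand-in mirroring the documented ArtifactMetadata fields."""

    artifact_id: str
    name: str
    timestamp: int
    furiosa_llm_version: str
    furiosa_compiler_version: str
    includes_composable_ir: bool

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "MetadataView":
        """Build a view from a dict, checking presence and exact type of each field."""
        values = {}
        for f in fields(cls):
            if f.name not in raw:
                raise KeyError(f"missing metadata field: {f.name}")
            value = raw[f.name]
            # Exact type match, so a bool is not accepted where an int is expected.
            if type(value) is not f.type:
                raise TypeError(f"{f.name} should be {f.type.__name__}")
            values[f.name] = value
        return cls(**values)
```

Extra keys in the input dictionary are ignored, so the view would keep working if later versions add fields.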