Commit

Merge pull request #27 from FederatedAI/dev-2.1.1
Dev 2.1.1
mgqa34 authored Jun 28, 2024
2 parents ff20184 + 11fccd7 commit cff1134
Showing 17 changed files with 602 additions and 126 deletions.
5 changes: 5 additions & 0 deletions RELEASE.md
@@ -1,3 +1,8 @@
## Release 2.1.1
### Major Features and Improvements
> Fate-Test: FATE Automated Testing Tool
* Add new subcommand `llmsuite` for FATE-LLM training and evaluation

## Release 2.1.0
### Major Features and Improvements
> Fate-Test: FATE Automated Testing Tool
16 changes: 14 additions & 2 deletions doc/fate_test.md
@@ -9,7 +9,7 @@ A collection of useful tools to running FATE tests and PipeLine tasks.
```bash
pip install -e python/fate_test
```
2. edit default fate\_test\_config.yaml
2. edit default fate\_test\_config.yaml; edit path to fate base/data base accordingly

```bash
# edit priority config file with system default editor
@@ -88,4 +88,16 @@ shown in last step
```bash
fate_test data generate -i <path contains *performance.yaml> -ng 10000 -fg 10 -fh 10 -m 1.0 --upload-data
fate_test performance -i <path contains *performance.yaml> --skip-data
```

- [llm-suite](./fate_test_command.md#llmsuite): runs FATE-Llm testsuites, each a collection of FATE-Llm jobs and/or evaluations

Before running llmsuite for the first time, install FATE-Llm and enable its import in FATE-Test scripts:

```bash
fate_test config include fate-llm
```

```bash
fate_test llmsuite -i <path contains *llmsuite.yaml>
```
152 changes: 152 additions & 0 deletions doc/fate_test_command.md
@@ -867,3 +867,155 @@ fate_test data --help
data after generate and upload dataset in testsuites
*path1*


## Llmsuite

Llmsuite runs a collection of FATE-Llm jobs in sequence and then evaluates them on user-specified tasks.
It also lets users compare the results of different llm jobs.

### command options

```bash
fate_test llmsuite --help
```

1. include:

```bash
fate_test llmsuite -i <path1 contains *llmsuite.yaml>
```

will run llm testsuites in
*path1*

2. exclude:

```bash
fate_test llmsuite -i <path1 contains *llmsuite.yaml> -e <path2 to exclude> -e <path3 to exclude> ...
```

will run llm testsuites in *path1* but not in *path2* and *path3*

3. glob:

```bash
fate_test llmsuite -i <path1 contains *llmsuite.yaml> -g "hetero*"
```

will run llm testsuites in subdirectories of *path1* whose names start
with *hetero*

4. algorithm-suite:

```bash
fate_test llmsuite -a "pellm"
```

will run built-in 'pellm' llm testsuite, which will train and evaluate a FATE-Llm model and a zero-shot model

5. timeout:

```bash
fate_test llmsuite -i <path1 contains *llmsuite.yaml> -m 3600
```

will run llm testsuites in *path1* and time out any job that does not finish
within 3600s; if tasks need more time, use a larger threshold

6. task-cores:

```bash
fate_test llmsuite -i <path1 contains *llmsuite.yaml> -p 4
```

will run llm testsuites in *path1* with script config "task-cores" set to 4

7. eval-config:

```bash
fate_test llmsuite -i <path1 contains *llmsuite.yaml> --eval-config <path2>
```

will run llm testsuites in *path1* with evaluation configuration set to *path2*

8. skip-evaluate:

```bash
fate_test llmsuite -i <path1 contains *llmsuite.yaml> --skip-evaluate
```

will run llm testsuites in *path1* without running evaluation

9. provider:

```bash
fate_test llmsuite -i <path1 contains *llmsuite.yaml> --provider <provider_name>
```

will run llm testsuites in *path1* with FATE provider set to *provider_name*

10. yes:

```bash
fate_test llmsuite -i <path1 contains *llmsuite.yaml> --yes
```

will run llm testsuites in *path1* directly, skipping the confirmation prompt
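
As a rough illustration of how the options above combine, the sketch below assembles a `fate_test llmsuite` argument list in Python. The helper name `build_llmsuite_command` and the example paths are hypothetical; only flags documented in this section are used, and the command is printed rather than executed.

```python
from typing import List, Optional, Sequence


def build_llmsuite_command(
    include: str,
    exclude: Sequence[str] = (),
    glob: Optional[str] = None,
    timeout: Optional[int] = None,
    task_cores: Optional[int] = None,
    eval_config: Optional[str] = None,
    provider: Optional[str] = None,
    skip_evaluate: bool = False,
    yes: bool = False,
) -> List[str]:
    """Assemble argv for `fate_test llmsuite` (hypothetical helper)."""
    argv = ["fate_test", "llmsuite", "-i", include]
    # -e may be repeated, once per excluded path
    for path in exclude:
        argv += ["-e", path]
    if glob is not None:
        argv += ["-g", glob]
    if timeout is not None:
        argv += ["-m", str(timeout)]
    if task_cores is not None:
        argv += ["-p", str(task_cores)]
    if eval_config is not None:
        argv += ["--eval-config", eval_config]
    if provider is not None:
        argv += ["--provider", provider]
    # boolean switches take no value
    if skip_evaluate:
        argv.append("--skip-evaluate")
    if yes:
        argv.append("--yes")
    return argv


command = build_llmsuite_command(
    include="examples/llmsuite",            # hypothetical path
    exclude=["examples/llmsuite/skip_me"],  # hypothetical path
    timeout=3600,
    yes=True,
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where `fate_test` is installed.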


### FATE-Llm job configuration

Job configuration should be specified in an llm testsuite whose
file name matches "\*llmsuite.yaml". For an llm testsuite example,
please refer to the [FATE-LLM repository](https://github.com/FederatedAI/FATE-LLM).

A FATE-Llm testsuite includes the following elements:

- job group: each group includes an arbitrary number of jobs with paths
  to the corresponding script and configuration

  - job: name of the evaluation job to be run; must be unique within each group

    - script: path to [testing script](#testing-script), should be
      relative to testsuite, optional for evaluation-only jobs;
      note that the pretrained model, if available, should be returned at the end of the script
    - conf: path to job configuration file for script, should be
      relative to testsuite, optional for evaluation-only jobs
    - pretrained: path to pretrained model, either a model name from Hugging Face or a path relative to
      the testsuite; optional for jobs that run a FATE-Llm training job, in which case the
      script should return the path to the pretrained model
    - peft: path to peft file, should be relative to testsuite,
      optional for jobs that run a FATE-Llm training job
    - tasks: list of tasks to be evaluated, optional for jobs skipping evaluation
    - include_path: should be specified if tasks are user-defined
    - eval_conf: path to evaluation configuration file, should be
      relative to testsuite; if not provided, the default conf is used

```yaml
bloom_lora:
  pretrained: "models/bloom-560m"
  script: "./test_bloom_lora.py"
  conf: "./bloom_lora_config.yaml"
  peft_path_format: "{{fate_base}}/fate_flow/model/{{job_id}}/guest/{{party_id}}/{{model_task_name}}/0/output/output_model/model_directory"
  tasks:
    - "dolly-15k"
```

- llm suite

```yaml
hetero_nn_sshe_binary_0:
  bloom_lora:
    pretrained: "bloom-560m"
    script: "./test_bloom_lora.py"
    conf: "./bloom_lora_config.yaml"
    peft_path_format: "{{fate_base}}/fate_flow/model/{{job_id}}/guest/{{party_id}}/{{model_task_name}}/0/output/output_model/model_directory"
    tasks:
      - "dolly-15k"
  bloom_zero_shot:
    pretrained: "bloom-560m"
    tasks:
      - "dolly-15k"
```
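
To make the group/job layout concrete, the following sketch walks a suite that has already been loaded into a Python dict and reports which jobs train and which only evaluate. The dict mirrors the example above; the function name `summarize_suite` and the rule "a job trains if it has a `script` key" are assumptions made for illustration, not FATE-Test's actual loader.

```python
from typing import Any, Dict, List

# Stand-in for the parsed content of a *llmsuite.yaml file (same data as the example above).
suite: Dict[str, Dict[str, Dict[str, Any]]] = {
    "hetero_nn_sshe_binary_0": {
        "bloom_lora": {
            "pretrained": "bloom-560m",
            "script": "./test_bloom_lora.py",
            "conf": "./bloom_lora_config.yaml",
            "peft_path_format": (
                "{{fate_base}}/fate_flow/model/{{job_id}}/guest/{{party_id}}/"
                "{{model_task_name}}/0/output/output_model/model_directory"
            ),
            "tasks": ["dolly-15k"],
        },
        "bloom_zero_shot": {
            "pretrained": "bloom-560m",
            "tasks": ["dolly-15k"],
        },
    }
}


def summarize_suite(parsed: Dict[str, Dict[str, Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Flatten job groups into one row per job (illustrative helper)."""
    rows = []
    for group_name, jobs in parsed.items():
        for job_name, job in jobs.items():
            rows.append({
                "group": group_name,
                "job": job_name,
                # assumption: a job with a script runs a training job first
                "trains": "script" in job,
                # jobs without tasks skip evaluation
                "evaluates": bool(job.get("tasks")),
                "tasks": list(job.get("tasks", [])),
            })
    return rows


for row in summarize_suite(suite):
    print(row)
```

Here `bloom_lora` is reported as a training job followed by evaluation, while `bloom_zero_shot` is evaluation-only because it specifies neither `script` nor `conf`.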
12 changes: 6 additions & 6 deletions python/fate_test/_config.py
@@ -36,20 +36,20 @@
# st_config_directory: examples/flow_test_template/hetero_lr/flow_test_config.yaml
# directory stores testsuite file with min_test data sets to upload,
-# default location={FATE}/examples/data/upload_config/min_test_data_testsuite.json
-min_test_data_config: examples/data/upload_config/min_test_data_testsuite.json
+# default location={FATE}/examples/data/upload_config/min_test_data_testsuite.yaml
+min_test_data_config: examples/data/upload_config/min_test_data_testsuite.yaml
# directory stores testsuite file with all example data sets to upload,
-# default location={FATE}/examples/data/upload_config/all_examples_data_testsuite.json
-all_examples_data_config: examples/data/upload_config/all_examples_data_testsuite.json
+# default location={FATE}/examples/data/upload_config/all_examples_data_testsuite.yaml
+all_examples_data_config: examples/data/upload_config/all_examples_data_testsuite.yaml
# directory where FATE code locates, default installation location={FATE}/fate
# python/ml -> $fate_base/python/ml
-fate_base: path(FATE)/fate
+fate_base: path(FATE)/
# whether to delete data in suites after all jobs done
clean_data: true
-# participating parties' id and correponding flow service ip & port information
+# participating parties' id and corresponding flow service ip & port information
parties:
guest: ['9999']
host: ['10000', '9999']
85 changes: 25 additions & 60 deletions python/fate_test/_flow_client.py
@@ -41,6 +41,29 @@ def __init__(self,
    def set_address(self, address):
        self.address = address

    def bind_table(self, data: Data, callback=None):
        conf = data.config
        conf['file'] = os.path.join(str(self._data_base_dir), conf.get('file'))
        path = Path(conf.get('file'))
        if not path.exists():
            raise Exception('The file is obtained from the fate flow client machine, but it does not exist, '
                            f'please check the path: {path}')
        response = self._client.table.bind_path(path=str(path),
                                                namespace=data.namespace,
                                                name=data.table_name)
        try:
            if callback is not None:
                callback(response)
                status = str(response['message']).lower()
            else:
                status = response["message"]
            code = response["code"]
            if code != 0:
                raise RuntimeError(f"Return code {code} != 0, bind path failed")
        except BaseException:
            raise ValueError(f"Bind path failed, response={response}")
        return status

def transform_local_file_to_dataframe(self, data: Data, callback=None, output_path=None):
#data_warehouse = self.upload_data(data, callback, output_path)
#status = self.transform_to_dataframe(data.namespace, data.table_name, data_warehouse, callback)
@@ -82,44 +105,6 @@ def upload_file_and_convert_to_dataframe(self, data: Data, callback=None, output
self._awaiting(job_id, "local", 0)
return status

"""def upload_data(self, data: Data, callback=None, output_path=None):
response, file_path = self._upload_data(data, output_path=output_path)
try:
if callback is not None:
callback(response)
code = response["code"]
if code != 0:
raise ValueError(f"Return code {code}!=0")
namespace = response["data"]["namespace"]
name = response["data"]["name"]
job_id = response["job_id"]
except BaseException:
raise ValueError(f"Upload data fails, response={response}")
# self.monitor_status(job_id, role=self.role, party_id=self.party_id)
self._awaiting(job_id, "local", 0)
return dict(namespace=namespace, name=name)
def transform_to_dataframe(self, namespace, table_name, data_warehouse, callback=None):
response = self._client.data.dataframe_transformer(namespace=namespace,
name=table_name,
data_warehouse=data_warehouse)
try:
if callback is not None:
callback(response)
status = self._awaiting(response["job_id"], "local", 0)
status = str(status).lower()
else:
status = response["retmsg"]
except Exception as e:
raise RuntimeError(f"upload data failed") from e
job_id = response["job_id"]
self._awaiting(job_id, "local", 0)
return status"""

def delete_data(self, data: Data):
try:
table_name = data.config['table_name'] if data.config.get(
@@ -154,27 +139,6 @@ def _awaiting(self, job_id, role, party_id, callback=None):
callback(response)
time.sleep(1)

"""def _upload_data(self, data, output_path=None, verbose=0, destroy=1):
conf = data.config
# if conf.get("engine", {}) != "PATH":
if output_path is not None:
conf['file'] = os.path.join(os.path.abspath(output_path), os.path.basename(conf.get('file')))
else:
if _config.data_switch is not None:
conf['file'] = os.path.join(str(self._cache_directory), os.path.basename(conf.get('file')))
else:
conf['file'] = os.path.join(str(self._data_base_dir), conf.get('file'))
path = Path(conf.get('file'))
if not path.exists():
raise Exception('The file is obtained from the fate flow client machine, but it does not exist, '
f'please check the path: {path}')
response = self._client.data.upload(file=str(path),
head=data.head,
meta=data.meta,
extend_sid=data.extend_sid,
partitions=data.partitions)
return response, conf["file"]"""

def _output_data_table(self, job_id, role, party_id, task_name):
response = self._client.output.data_table(job_id, role=role, party_id=party_id, task_name=task_name)
if response.get("code") is not None:
@@ -223,7 +187,7 @@ def get_version(self):
"""def _add_notes(self, job_id, role, party_id, notes):
data = dict(job_id=job_id, role=role, party_id=party_id, notes=notes)
response = AddNotesResponse(self._post(url='job/update', json=data))
return response"""
return response
def _table_bind(self, data):
response = self._post(url='table/bind', json=data)
@@ -235,6 +199,7 @@ def _table_bind(self, data):
except Exception as e:
raise RuntimeError(f"table bind error: {response}") from e
return response
"""


class Status(object):
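
The response handling added in `bind_table` above can be read in isolation. The sketch below is an illustration under stated assumptions, not the committed code: it assumes the flow client returns a dict with `code` and `message` keys (as the diff suggests), uses a hypothetical function name `bind_status`, checks the return code before reading the status, always lowercases the message, and lets the `RuntimeError` propagate instead of re-wrapping it in a `ValueError`.

```python
from typing import Any, Callable, Dict, Optional


def bind_status(response: Dict[str, Any],
                callback: Optional[Callable[[Dict[str, Any]], None]] = None) -> str:
    """Return the lowercased status message of a bind-path response (illustrative)."""
    if callback is not None:
        # give the caller a chance to log or display the raw response
        callback(response)
    code = response.get("code")
    if code != 0:
        raise RuntimeError(f"Return code {code} != 0, bind path failed, response={response}")
    return str(response.get("message", "")).lower()


seen = []
status = bind_status({"code": 0, "message": "Success"}, callback=seen.append)
print(status)
```

Checking `code` first means a failed bind is reported with its original exception type, which can make failures easier to tell apart in test logs.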
4 changes: 4 additions & 0 deletions python/fate_test/_io.py
@@ -32,6 +32,10 @@ def echo(cls, message, **kwargs):
        click.secho(message, **kwargs)
        click.secho(message, file=cls._file, **kwargs)

    @classmethod
    def sep_line(cls):
        click.secho("-------------------------------------------------")

    @classmethod
    def file(cls, message, **kwargs):
        click.secho(message, file=cls._file, **kwargs)