# Component

## What's a Component?

A `Component` is the **basic executable unit** in GLLM Core. It wraps a piece of async business logic and standardizes how that logic is:

1. **Discovered** via a single async entrypoint (`@main` or fallbacks).
2. **Executed** through a uniform `run(**kwargs)` method.
3. **Observed** via structured input/output events.
4. **Analyzed** so pipelines and orchestrators can understand its input contract.

At a high level:

1. **You implement a subclass** of `Component`.
2. **You mark one async method with `@main`** to declare the entrypoint.
3. **Pipelines never call your method directly**. They call `component.run(**kwargs)`.
4. **Input schemas can be generated** from the `@main` signature, enabling validation and argument construction.

This gives GLLM Core a **uniform abstraction** over heterogeneous logic: pipelines don't need to know whether a component is talking to an LLM, a database, an API, or anything else.

<details>

<summary>Prerequisites</summary>

This example requires completing all setup steps listed on the [prerequisites](https://gdplabs.gitbook.io/sdk/~/revisions/beykCxz0UanaEX0sPJJu/gen-ai-sdk/prerequisites "mention") page.

</details>

## Installation

{% tabs %}
{% tab title="Linux, macOS, or Windows WSL" %}

```bash
# you can use a Conda environment
pip install --extra-index-url https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/ gllm-core
```

{% endtab %}

{% tab title="Windows Powershell" %}

```powershell
# you can use a Conda environment
pip install --extra-index-url https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/ gllm-core
```

{% endtab %}

{% tab title="Windows Command Prompt" %}

```bash
REM you can use a Conda environment
FOR /F "tokens=*" %T IN ('gcloud auth print-access-token') DO pip install --extra-index-url "https://oauth2accesstoken:%T@glsdk.gdplabs.id/gen-ai-internal/simple/"  "gllm-core"
```

{% endtab %}
{% endtabs %}

## Quickstart

### Define Your First Component

```python
from gllm_core.schema import Component, main


class TextFormatter(Component):
    @main
    async def format(self, text: str, uppercase: bool = False, repeat: int = 1) -> str:
        """Format text with options."""
        result = text.upper() if uppercase else text
        return result * repeat
```

Key points:

1. **Subclass** `Component`.
2. Define a single **async** method (here `format`).
3. Mark it with `@main` to tell the system: *"this is the entrypoint"*.

### Execute the Component Uniformly

```python
# Run this inside an async function, or in a REPL/notebook that supports top-level await.
formatter = TextFormatter()

result = await formatter.run(text="hello", uppercase=True, repeat=2)
assert result == "HELLOHELLO"
```

Orchestration code should **never** call `await formatter.format(...)` directly. Always call `await formatter.run(**kwargs)` instead, so that:

1. The Component base class emits **start/finish events**.
2. Logging is performed via the component's `_logger`.
3. Pipelines can treat every component the same way.
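
To illustrate why a single `run(**kwargs)` entrypoint matters, here is a minimal sketch of an orchestration loop. `StubComponent`, `Upper`, `Exclaim`, `_main`, and `run_pipeline` are hypothetical stand-ins written for this example and are not part of GLLM Core. The real `Component.run` also emits events and logs, which this sketch omits.

```python
import asyncio


class StubComponent:
    """Hypothetical stand-in for Component; not the GLLM Core class."""

    async def run(self, **kwargs):
        # The real base class would emit events and log around this call.
        return await self._main(**kwargs)


class Upper(StubComponent):
    async def _main(self, text: str) -> str:
        return text.upper()


class Exclaim(StubComponent):
    async def _main(self, text: str, marks: int = 1) -> str:
        return text + "!" * marks


async def run_pipeline(steps, text: str) -> str:
    """Call every step through the same run(**kwargs) interface."""
    for component, extra_kwargs in steps:
        text = await component.run(text=text, **extra_kwargs)
    return text


steps = [(Upper(), {}), (Exclaim(), {"marks": 2})]
result = asyncio.run(run_pipeline(steps, "hello"))
print(result)  # HELLO!!
```

The loop never references `Upper` or `Exclaim` by method name, which is the property the uniform entrypoint is meant to provide.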

### Use the Generated Input Schema

The design for Components includes an `input_params` property that exposes a **Pydantic model** mirroring the `@main` signature:

```python
formatter = TextFormatter()
ParamsModel = formatter.input_params  # type: ignore[attr-defined]

params = ParamsModel(text="world", repeat=2)
result = await formatter.run(**params.model_dump())
```

This gives you:

1. Type-checked construction of arguments.
2. Easy validation and error reporting.
3. A single source of truth: the `@main` signature.
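
The idea of deriving an input contract from the `@main` signature can be sketched with the standard library alone. The helper `describe_params` below is hypothetical and only approximates what a generated Pydantic model would capture (field names, annotations, defaults). It is not how `input_params` is implemented, and the plain `TextFormatter` class here does not subclass `Component`.

```python
import inspect
from typing import Any


class TextFormatter:
    """Plain class standing in for the Component subclass above."""

    async def format(self, text: str, uppercase: bool = False, repeat: int = 1) -> str:
        result = text.upper() if uppercase else text
        return result * repeat


def describe_params(method) -> dict[str, dict[str, Any]]:
    """Map each parameter (except self) to its annotation and default."""
    fields = {}
    for name, param in inspect.signature(method).parameters.items():
        if name == "self":
            continue
        required = param.default is inspect.Parameter.empty
        fields[name] = {
            "type": param.annotation,
            "required": required,
            "default": None if required else param.default,
        }
    return fields


fields = describe_params(TextFormatter.format)
for name, spec in fields.items():
    print(name, spec)
```

Because the field description is computed from the signature, changing a parameter or default in `format` changes the description automatically, which is the "single source of truth" property listed above.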

## The `@main` Decorator

### Purpose

The `@main` decorator marks **one async method** on a `Component` subclass as the canonical entrypoint. Architecturally, it enables:

1. **Entry-point abstraction**: pipelines don't need to know method names.
2. **Schema generation**: the `@main` signature drives `input_params`.
3. **Future interoperability**: the same entrypoint can later be wrapped as an MCP-compliant `Tool`.

According to the docs and specs:

1. `Component.get_main()` resolves the entrypoint by honoring `@main`, `__main_method__`, or falling back to `_run`.
2. `Component.input_params` generates a Pydantic model from the resolved main method.
3. `Component.run(**kwargs)` executes the resolved main coroutine and emits events.

### How `@main` Fits into Resolution

The entrypoint resolution is conceptually:

1. **Prefer an explicitly decorated `@main` method** on the subclass.
2. If none is decorated, look for a class-level `__main_method__` override.
3. As a compatibility fallback, use `_run`.

This resolution is cached (via a resolver such as `MainMethodResolver`) so the cost of introspection is paid once per class.
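
The resolution order above can be sketched as follows. This is an illustration under stated assumptions, not the actual `MainMethodResolver`: the `main` marker, the `__is_main__` attribute, and `resolve_main_name` are hypothetical, `__main_method__` is assumed to hold a method name as a string, and `functools.lru_cache` stands in for whatever caching the library uses.

```python
import functools


def main(method):
    """Hypothetical marker decorator: tag the function as the entrypoint."""
    method.__is_main__ = True
    return method


@functools.lru_cache(maxsize=None)
def resolve_main_name(cls) -> str:
    """Return the entrypoint name, following the three-step order above."""
    # 1. An explicitly decorated method wins.
    for klass in cls.__mro__:
        for name, attr in vars(klass).items():
            if getattr(attr, "__is_main__", False):
                return name
    # 2. Otherwise honor a class-level __main_method__ override.
    override = getattr(cls, "__main_method__", None)
    if override is not None:
        return override
    # 3. Fall back to the legacy _run method.
    if hasattr(cls, "_run"):
        return "_run"
    raise TypeError(f"{cls.__name__} defines no entrypoint")


class Decorated:
    @main
    async def format(self, text: str) -> str:
        return text

    async def _run(self, text: str) -> str:
        return text


class Overridden:
    __main_method__ = "process"

    async def process(self) -> None: ...


class Legacy:
    async def _run(self) -> None: ...


for cls in (Decorated, Overridden, Legacy):
    print(cls.__name__, "->", resolve_main_name(cls))
```

Since the cache is keyed on the class, repeated lookups for the same class skip the introspection loop entirely.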

## Component Lifecycle and Runtime Behavior

The `Component` base class provides a logger and a standard event flow:

1. `run(**kwargs)` prepares the **input event**:
   1. Formats it with the component name and arguments.
   2. Logs it via `_logger`.
   3. Optionally emits it through an `EventEmitter` if one is passed in `kwargs`.
2. It calls the resolved main coroutine (or `_run` in the current implementation).
3. It formats and logs an **output event** containing the result.

Binary payloads (e.g., `bytes`) are handled via `binary_handler_factory` so logs show sizes or summaries instead of raw bytes.
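
A minimal sketch of this flow, using only the standard library, might look like the following. `SketchComponent`, `Checksum`, and `summarize` are hypothetical names for this example. The sketch does not reproduce `binary_handler_factory` or `EventEmitter`, and the log message format is an assumption.

```python
import asyncio
import logging

logging.basicConfig(level=logging.INFO)


def summarize(value):
    """Replace raw bytes with a short size summary for logging."""
    if isinstance(value, (bytes, bytearray)):
        return f"<{len(value)} bytes>"
    return value


class SketchComponent:
    """Hypothetical stand-in; not the GLLM Core Component class."""

    def __init__(self) -> None:
        self._logger = logging.getLogger(type(self).__name__)

    async def run(self, **kwargs):
        name = type(self).__name__
        # 1. Format and log the input event, hiding raw binary payloads.
        safe_args = {key: summarize(val) for key, val in kwargs.items()}
        self._logger.info("[%s] input: %s", name, safe_args)
        # 2. Call the entrypoint (always _run here, for brevity).
        result = await self._run(**kwargs)
        # 3. Format and log the output event.
        self._logger.info("[%s] output: %s", name, summarize(result))
        return result


class Checksum(SketchComponent):
    async def _run(self, payload: bytes) -> int:
        return sum(payload) % 256


checksum = asyncio.run(Checksum().run(payload=b"\x01\x02\x03"))
print(checksum)  # 6
```

The component receives the original `bytes` untouched; only the logged representation is summarized.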

## Designing Good Component APIs

### Prefer Clear, Typed Parameters

When defining your `@main` method:

1. Use **explicit type hints** for all parameters.
2. Provide **sensible defaults** where appropriate.
3. Reserve `**kwargs` for truly open-ended options.

Example:

```python
from gllm_core.schema import Component, main


class DataProcessor(Component):
    @main
    async def process(
        self,
        data: list[dict],
        limit: int = 100,
        **options,
    ) -> dict:
        """Process data with optional filters."""
        processed = data[:limit]
        return {
            "count": len(processed),
            "data": processed,
            "options": options,
        }
```

With the planned `input_params` behavior, this signature is expected to:

1. Generate a `DataProcessorParams` model.
2. Enforce types for `data` and `limit`.
3. Allow extra fields (because of `**options`) via `extra="allow"`.
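
How a schema generator could decide whether to allow extra fields can be sketched with `inspect`. The helpers `accepts_extra_fields` and `split_fields` are hypothetical and use plain functions instead of `Component` methods. Rejecting unknown fields when there is no `**kwargs` parameter is an assumption made for this illustration, not documented `input_params` behavior.

```python
import inspect


def accepts_extra_fields(method) -> bool:
    """True when the signature declares a **kwargs-style parameter."""
    params = inspect.signature(method).parameters.values()
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params)


def split_fields(method, values: dict) -> tuple[dict, dict]:
    """Separate declared parameters from extras destined for **kwargs."""
    declared = {
        name
        for name, p in inspect.signature(method).parameters.items()
        if p.kind is not inspect.Parameter.VAR_KEYWORD
    }
    named = {k: v for k, v in values.items() if k in declared}
    extra = {k: v for k, v in values.items() if k not in declared}
    if extra and not accepts_extra_fields(method):
        # Assumption for this sketch: unknown fields are rejected.
        raise TypeError(f"unexpected fields: {sorted(extra)}")
    return named, extra


async def process(data: list[dict], limit: int = 100, **options) -> dict:
    return {"count": len(data[:limit]), "options": options}


async def strict(data: list[dict], limit: int = 100) -> dict:
    return {"count": len(data[:limit])}


named, extra = split_fields(process, {"data": [], "limit": 5, "verbose": True})
print(named, extra)
```

Passing the same input to `split_fields(strict, ...)` raises `TypeError`, which mirrors the stricter validation you get when the signature has no `**options`.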

### When to Use `**kwargs`

Use `**kwargs` when:

1. You truly don't know all the options ahead of time.
2. You want to **forward arbitrary parameters** to downstream systems.

Avoid it when:

1. You can name and type your parameters precisely.
2. You want strict validation and clear API docs.
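
As a small illustration of the forwarding case, the sketch below names and types the parameters it knows about and passes everything else through unchanged. `fake_search_backend` and `search` are hypothetical functions, not GLLM Core APIs.

```python
import asyncio


async def fake_search_backend(query: str, **params) -> dict:
    """Hypothetical downstream system that accepts open-ended options."""
    return {"query": query, "params": params}


async def search(query: str, top_k: int = 5, **backend_options) -> dict:
    """Name and type what you know; forward the rest untouched."""
    return await fake_search_backend(query, top_k=top_k, **backend_options)


response = asyncio.run(search("llm", top_k=3, region="id", timeout=2.0))
print(response)
```

Here `query` and `top_k` stay validated and documented, while `region` and `timeout` are opaque to the wrapper and only meaningful to the backend.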

***

## Migration from Legacy `_run` Components

If you have existing components that only implement `_run`, they continue to work:

```python
from gllm_core.schema import Component


class LegacyComponent(Component):
    async def _run(self, message: str, priority: int = 1) -> str:
        """Legacy component using _run."""
        return f"[P{priority}] {message}"
```

You can still:

1. Call `await LegacyComponent().run(message="hi")`.
2. Rely on `input_params` (per the design) to generate a matching Pydantic model.

To migrate to the new style:

1. Introduce a `@main` method that mirrors `_run`.
2. Gradually update callers to rely on the new entrypoint semantics and schema.
3. Eventually, stop depending on `_run` as a public surface.
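
One way the first migration step could look is sketched below. To keep the example runnable without GLLM Core installed, `main` is a local stand-in that only tags the method, and `MigratedComponent` is a plain class. In real code you would import `Component` and `main` from `gllm_core.schema` and subclass `Component`. The method name `prioritize` is hypothetical.

```python
import asyncio


def main(method):
    """Stand-in for gllm_core's @main; here it only tags the method."""
    method.__is_main__ = True
    return method


class MigratedComponent:  # would subclass Component in real code
    @main
    async def prioritize(self, message: str, priority: int = 1) -> str:
        """New entrypoint with the same signature as the legacy _run."""
        return f"[P{priority}] {message}"

    async def _run(self, message: str, priority: int = 1) -> str:
        """Kept temporarily so existing direct callers keep working."""
        return await self.prioritize(message=message, priority=priority)


component = MigratedComponent()
new_style = asyncio.run(component.prioritize(message="hi"))
old_style = asyncio.run(component._run(message="hi"))
print(new_style, old_style)  # [P1] hi [P1] hi
```

Keeping `_run` as a thin delegate means both paths return identical results during the transition, and `_run` can be deleted once no caller depends on it.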
