BitMoE - Pytorch
A 1-bit Mixture of Experts that combines BitNet-style 1-bit layers with a Mixture-of-Experts architecture. Distributing the experts across multiple GPUs is also planned.
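To make the combination concrete, here is a minimal, hypothetical sketch of a 1-bit MoE layer: each expert is a feedforward block built from sign-binarized linear layers (trained with a straight-through estimator), and a learned softmax gate mixes the expert outputs. The names `BitLinear` and `SimpleBitMoE` are illustrative only and not part of the bitmoe API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinear(nn.Linear):
    """Linear layer whose weights are binarized to {-1, +1} in the forward
    pass; a straight-through estimator keeps gradients flowing to the
    latent full-precision weights."""
    def forward(self, x):
        w_bin = (self.weight >= 0).to(self.weight.dtype) * 2 - 1  # {-1, +1}
        w_q = self.weight + (w_bin - self.weight).detach()        # STE
        return F.linear(x, w_q, self.bias)

class SimpleBitMoE(nn.Module):
    """Dense MoE: every expert runs on every token, mixed by a softmax gate."""
    def __init__(self, dim, hidden_dim, output_dim, num_experts):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)  # router scores per token
        self.experts = nn.ModuleList(
            nn.Sequential(
                BitLinear(dim, hidden_dim),
                nn.GELU(),
                BitLinear(hidden_dim, output_dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (batch, seq_len, dim)
        weights = F.softmax(self.gate(x), dim=-1)                  # (b, s, E)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)   # (b, s, out, E)
        return (outs * weights.unsqueeze(-2)).sum(dim=-1)          # (b, s, out)
```

The straight-through estimator means the forward pass sees only 1-bit weights while backprop updates the full-precision copies, which is the usual BitNet-style training trade-off.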
Install
$ pip3 install bitmoe
Usage
import torch
from bitmoe.main import BitMoE
# Set the parameters
dim = 10 # Dimension of the input
hidden_dim = 20 # Dimension of the hidden layer
output_dim = 30 # Dimension of the output
num_experts = 5 # Number of experts in the BitMoE model
# Create the model
model = BitMoE(dim, hidden_dim, output_dim, num_experts)
# Create random inputs
batch_size = 32 # Number of samples in a batch
sequence_length = 100 # Length of the input sequence
x = torch.randn(batch_size, sequence_length, dim) # Random input tensor
# Forward pass
output = model(x) # Perform forward pass using the model
# Print the output tensor and its shape
print(output) # Print the output tensor
print(output.shape) # Print the shape of the output tensor
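With the parameters above, the printed shape should come out as `torch.Size([32, 100, 30])`, assuming `BitMoE` projects each token to `output_dim`; if your installed version keeps the model dimension instead, the last axis will be `dim`. A quick sanity check that routing preserves the token layout:

```python
assert output.shape[:2] == (batch_size, sequence_length)  # tokens preserved
```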
License
MIT
Todo
- Implement better gating mechanisms (a top-k routing sketch follows this list)
- Implement a better routing algorithm
- Implement a better BitFeedForward
- Implement
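For the gating and routing items above, a common direction is top-k routing (Switch-Transformer style): score all experts, keep only the k best per token, and renormalize. A minimal sketch, independent of BitMoE's current internals:

```python
import torch
import torch.nn.functional as F

def top_k_gate(logits: torch.Tensor, k: int = 2) -> torch.Tensor:
    """Zero out all but the k largest expert scores per token and
    renormalize, so each token is routed to only k experts.
    logits: (..., num_experts) router scores."""
    vals, idx = logits.topk(k, dim=-1)     # k best experts per token
    weights = F.softmax(vals, dim=-1)      # renormalize among survivors
    gate = torch.zeros_like(logits)
    return gate.scatter(-1, idx, weights)  # sparse gate; rows sum to 1

# Example: 4 experts, each token routed to its best 2
gate = top_k_gate(torch.randn(32, 100, 4), k=2)
print(gate.shape)  # torch.Size([32, 100, 4]); only 2 nonzeros per row
```

Because every row of the gate is mostly zeros, only the selected experts actually need to run, which is what makes sparse routing cheaper than the dense mixture.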