Version 0.2.0a1
Released December 21st, 2025
Added
- Global, per-connection, and per-source configuration system for extensive customization (bitrate, channels, volume, etc.).
- MEMORY/DISK buffer modes for audio playback. Details below.
- VoiceClient now automatically cleans up when signalled by hikari (stopping/closing).
Changed
- FFmpeg blocksize changed from the default 8 KB to 32 KB to reduce I/O overhead and improve streaming stability.
- AudioSource implementations refactored and cleaned up.
Fixed
- Added a guard to the internal voice gateway packet receiver for the case where the websocket no longer exists.
Audio Buffer System Implementation
- To further hikari-wave's goal of being modern and scalable, we have decided to implement a custom audio buffering system.
- This system lets developers decide, through configuration, how much RAM the audio system will use.
- The system allows developers to set the buffer mode, either MEMORY (default) or DISK.
- The MEMORY mode stores all pre-encoded audio frames (that play in the audio player) in an internal, in-memory (RAM) buffer.
- The DISK mode stores all pre-encoded audio frames in files in the program's directory, under wavecache/<guild_id>/.
- When using the DISK mode, developers can set how big these files are via the buffer config of the Config object in the VoiceClient constructor, or via set_config for a specific VoiceConnection. The duration option decides how many seconds of frames get stored in each file.
- When benchmarking with our newest FFmpeg process-pooling system, 100 active audio sessions took ~250 MB of RAM (buffered audio alone). To improve both scalability and developer experience, we wanted a more efficient and cost-effective solution.
- We understand most Discord voice bots will be private, so they'll either be self-hosted on personal computers or hosted on cheap VPSes/servers.
- This gives the option to bring RAM usage anywhere from 4-8 KB/audio source up to 3-5 MB/audio source, depending on mode and implementation, trading RAM usage for disk space. Total RAM usage then ranges from 50 MB to 300-500 MB, including FFmpeg processes.
- The default profile is MEMORY, as that is the expected behavior in most voice libraries, but developers can easily configure otherwise.
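
To make the DISK mode idea more concrete, here is a minimal, standalone sketch of segment-based frame buffering. It is not hikari-wave's implementation: the class name DiskFrameBuffer, the .seg file extension, the length-prefixed file layout, and the 20 ms frame length are all assumptions made for illustration. Only the wavecache/<guild_id>/ directory and the per-file duration (in seconds) come from the notes above.

```python
import struct
from pathlib import Path

FRAME_MS = 20  # assumption: 20 ms per encoded frame
FRAMES_PER_SECOND = 1000 // FRAME_MS


class DiskFrameBuffer:
    """Hypothetical sketch: append encoded frames to fixed-duration segment files."""

    def __init__(self, root: Path, guild_id: int, duration: int) -> None:
        # Directory layout from the notes above: wavecache/<guild_id>/
        self.directory = Path(root) / "wavecache" / str(guild_id)
        self.directory.mkdir(parents=True, exist_ok=True)
        # `duration` is the number of seconds of frames stored in each file.
        self.frames_per_segment = duration * FRAMES_PER_SECOND
        self._fh = None
        self._count = 0
        self._index = 0

    def write(self, frame: bytes) -> None:
        """Write one frame straight to disk, rotating files when a segment is full."""
        if self._fh is None:
            path = self.directory / f"{self._index:06d}.seg"
            self._fh = path.open("wb")
            self._index += 1
        # Assumed layout: 2-byte little-endian length, then the frame bytes.
        self._fh.write(struct.pack("<H", len(frame)))
        self._fh.write(frame)
        self._count += 1
        if self._count >= self.frames_per_segment:
            self.close()

    def close(self) -> None:
        """Finish the current segment file, if one is open."""
        if self._fh is not None:
            self._fh.close()
            self._fh = None
        self._count = 0

    def read_frames(self):
        """Yield frames back in write order, one segment file at a time."""
        for path in sorted(self.directory.glob("*.seg")):
            with path.open("rb") as fh:
                while header := fh.read(2):
                    (size,) = struct.unpack("<H", header)
                    yield fh.read(size)
```

Because each frame goes to the open file as soon as it arrives, only the frame being handled stays in RAM, which is the kind of tradeoff against disk space described above. A larger duration means fewer, bigger files. A smaller one means more files, each of which can be discarded sooner once played.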