fix: Enable Base64 audio input/output handling for API endpoints #2719
Open
neon-aiart wants to merge 5 commits into RVC-Project:main from
Implement Base64 audio encoding/decoding logic in vc_single
This commit focuses on the core logic within modules.py to fully enable Base64 audio
data flow for API endpoints (e.g., infer_convert).
1. Fix Base64 Input Handling (f0_file object):
- Corrects the bug where Base64 input failed because the function expected a file path.
- If f0_file is a file-like object, the code now extracts the path via the '.name' attribute
to ensure compatibility with subsequent file-based processing.
2. Implement Base64 Output:
- The converted audio (audio_opt) is now safely written to a temporary WAV file using soundfile.write.
- The WAV content is read and encoded into a Base64 Data URI (data:audio/wav;base64,...).
- The function now returns this Base64 data as the 3rd element in the return tuple,
allowing external tools to bypass the Gradio Audio component and receive raw data.
3. Fix vc_multi Compatibility:
- The return value of vc_single was increased from 2 to 3 elements.
- Updated the call site in vc_multi to accept the 3rd element as 'base64_opt'
to prevent runtime errors and ensure code clarity in batch conversion mode.
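The input and output handling described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper names `resolve_f0_path`, `pcm16_to_data_uri`, and `data_uri_to_bytes` are hypothetical, and the standard-library `wave` module stands in for `soundfile.write` so the sketch runs without third-party dependencies. It also assumes 16-bit mono PCM samples, which the PR does not specify.

```python
import base64
import os
import struct
import tempfile
import wave


def resolve_f0_path(f0_file):
    """Return a filesystem path for f0_file (hypothetical helper).

    Accepts None, a path, or a file-like object that exposes its path via .name.
    """
    if f0_file is None:
        return None
    if isinstance(f0_file, (str, os.PathLike)):
        return os.fspath(f0_file)
    # File-like objects (e.g. tempfile wrappers) carry their path in .name
    return getattr(f0_file, "name", None)


def pcm16_to_data_uri(samples, sample_rate):
    """Write 16-bit mono PCM samples to a temporary WAV file, then return
    its content as a Base64 Data URI (data:audio/wav;base64,...)."""
    fd, tmp_path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        with wave.open(tmp_path, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
            wav.setframerate(sample_rate)
            wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
        with open(tmp_path, "rb") as f:
            encoded = base64.b64encode(f.read()).decode("ascii")
    finally:
        os.remove(tmp_path)  # do not leave the temporary file behind
    return "data:audio/wav;base64," + encoded


def data_uri_to_bytes(data_uri):
    """Client side: strip the Data URI header and decode the WAV bytes."""
    _header, _, payload = data_uri.partition(",")
    return base64.b64decode(payload)
```

A caller that previously unpacked two values from `vc_single` would, under this change, unpack three, with the Data URI as the third element; an external client could then pass that string to something like `data_uri_to_bytes` to recover the raw WAV bytes.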
Implement wildcard file search to resolve audio path mismatch
This commit addresses a file path resolution issue in audio.py, specifically when dealing with
audio files generated with random suffixes in their filenames (e.g., temporary files).
1. Add glob import:
- Imports the standard 'glob' module for wildcard searching.
2. Wildcard Path Resolution:
- If os.path.exists(file) is False, a wildcard search is performed by inserting '*'
before the file extension (e.g., 'file.wav' becomes 'file*.wav').
- This correctly finds the temporary audio file even if it contains a random suffix,
preventing the original 'file not found' error.
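The wildcard fallback described in this commit can be illustrated with a short sketch. The function name `resolve_audio_path` is hypothetical, and picking the first match in sorted order is an assumption; the commit message does not say how multiple matches are handled.

```python
import glob
import os


def resolve_audio_path(file):
    """Return file if it exists; otherwise try 'stem*.ext' and return the
    first match, falling back to the original path when nothing matches."""
    if os.path.exists(file):
        return file
    stem, ext = os.path.splitext(file)
    # Escape the literal parts so only the inserted '*' acts as a wildcard
    pattern = glob.escape(stem) + "*" + glob.escape(ext)
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else file
```

Note that a pattern such as 'file*.wav' can also match unrelated files that share the same prefix, so a stricter pattern may be preferable when several files live in the same directory.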
Pull request checklist
The PR has a proper title. Use Semantic Commit Messages.
Make sure this is ready to be merged into the relevant branch.
Ensure you can run the code you submitted successfully. These submissions will be prioritized for review:
Fix existing bugs reported in user feedback (or that you encountered yourself);
Introduce more convenient user operations.
PR type
Description
This PR implements essential fixes and extensions to enable robust Base64 audio
data transmission for API endpoints (e.g., infer_convert), directly addressing existing
input failures.
Rationale and Value
This change addresses a critical gap where Base64 handling, despite being implied in API documentation,
was non-functional.
JavaScript running in a browser) to utilize RVC's API, significantly boosting integration
potential. Its utility is already demonstrated by the 'Neon Spitch Link' UserScript.
processed correctly as per API expectations, making the feature fully functional as intended
by the API design.
Detailed Changes
This change revolves around resolving input failures and providing a clean output path.
1. Core Logic Fix (modules.py)
2. Path Reliability Fix (audio.py)
glob wildcard search in load_audio to successfully resolve temporary audio file paths containing random suffixes, preventing 'file not found' errors.
3. WebUI Component Update (infer-web.py)
What will it affect
Screenshot