@andrewwan0131

Why are these changes needed?

These changes enable users to upload PDF files as context for LLM queries.

Changes made

  1. Added PDF file handling capabilities:

    • Implemented PDF file upload support in the web interface
    • Added PDF text extraction functionality
    • Integrated extracted PDF content as context for LLM queries
  2. Modified relevant files:

    • Updated gradio web server components to handle PDF uploads
    • Added PDF processing utilities
    • Enhanced chat protocol to include document context
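
To illustrate how extracted PDF content could be passed as context for an LLM query, here is a minimal sketch. It is not the code from this PR: the function name `build_prompt_with_pdf_context`, the `max_chars` budget, and the prompt layout are all assumptions made for the example. The per-page text is assumed to have been extracted already by whatever PDF utility is in use.

```python
from typing import List


def build_prompt_with_pdf_context(
    question: str, page_texts: List[str], max_chars: int = 4000
) -> str:
    """Hypothetical helper: prepend extracted PDF text to a user question."""
    sections = []
    for number, text in enumerate(page_texts, start=1):
        # Collapse runs of whitespace left over from text extraction.
        cleaned = " ".join(text.split())
        if cleaned:
            # Label each page so answers can point back to the source page.
            sections.append(f"[Page {number}] {cleaned}")
    document = "\n".join(sections)
    # Assumed character budget so a long PDF does not overflow the context.
    if len(document) > max_chars:
        document = document[:max_chars]
    return f"Document:\n{document}\n\nQuestion: {question}"


if __name__ == "__main__":
    pages = ["Hello   world", "", "Second\npage"]
    print(build_prompt_with_pdf_context("What is on page 3?", pages))
```

Truncating by characters is the simplest possible budget. A real integration would more likely count tokens for the target model and might select relevant pages rather than cutting the tail off.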

Checks

  • I've tested the PDF upload and context integration with various document types by running Chatbot Arena locally

Member

@infwinston left a comment

Thanks @andrewwan0131, I left some comments!

Member

@infwinston left a comment

Thanks, I left more comments.

@CodingWithTim self-requested a review December 26, 2024 23:16
@CodingWithTim self-assigned this Dec 26, 2024
@CodingWithTim
Collaborator

CodingWithTim commented Dec 30, 2024

@andrewwan0131 @PranavB-11 I resolved the old comments because they are no longer relevant. We can start commenting on the new code, since it is quite different from before. The pdfchat is now operational; I will test it extensively and improve it next.

Next steps:

  1. Fix some existing UI issues that are bothering me at the moment.
  2. Integrate our language detection code into parse_pdf.
  3. Add a PDF moderator.

Member

@infwinston left a comment

Thanks @CodingWithTim! I left some quick comments.

@CodingWithTim
Collaborator

71 files changed?? 😭😭

@PranavB-11
Collaborator

ohhh it was the formatting commit, it added a billion spaces to every file

This reverts commit 0955a76.

Collaborator

@CodingWithTim left a comment

@andrewwan0131 @PranavB-11 @yixin-huang1 Great work guys! I only fixed a few small bugs and cleaned up the logic. Everything now works!

Member

Why do we need to remove this?

Author

This was accidentally created when I pushed the Black formatting commit, so we reverted the changes.

@CodingWithTim
Collaborator

CodingWithTim commented Jun 10, 2025

This PR was transferred to the internal repo.

6 participants