Home/ Artificial Intelligence/ Group items tagged LMSYS

Janos Haits

Chatbot Arena: Elo Rating Calculation (July 17, 2023) - Colab - 0 views

  •  
    "In this notebook, we will employ the Elo rating system to evaluate the performance of large language models (LLMs). The analysis is based on the pairwise battle results we collected from https://arena.lmsys.org between April 24 and July 17, 2023. This crowdsourcing way of data collection represents some use cases of LLMs in the wild. Below, we present the calculation procedure along with some basic analyses. To view the latest leaderboard, see https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard."
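
    For readers unfamiliar with Elo, the following is a minimal sketch of an online Elo update over pairwise battle results. It is not taken from the linked notebook: the function name `compute_elo`, the tuple layout of `battles`, the winner labels, and the default values of `k`, `scale`, `base`, and `init_rating` are all assumptions chosen for illustration.

```python
from collections import defaultdict


def compute_elo(battles, k=4, scale=400, base=10, init_rating=1000):
    """Return a dict of Elo ratings after processing battles in order.

    `battles` is an iterable of (model_a, model_b, winner) tuples, where
    winner is "model_a", "model_b", or anything else for a tie.
    All names and default parameters here are illustrative assumptions.
    """
    ratings = defaultdict(lambda: init_rating)

    for model_a, model_b, winner in battles:
        ra, rb = ratings[model_a], ratings[model_b]

        # Expected score of model_a under the standard logistic Elo model.
        expected_a = 1 / (1 + base ** ((rb - ra) / scale))
        expected_b = 1 - expected_a

        # Actual score: 1 for a win, 0 for a loss, 0.5 for a tie.
        if winner == "model_a":
            score_a = 1.0
        elif winner == "model_b":
            score_a = 0.0
        else:
            score_a = 0.5

        # Move each rating toward the observed result.
        ratings[model_a] = ra + k * (score_a - expected_a)
        ratings[model_b] = rb + k * ((1 - score_a) - expected_b)

    return dict(ratings)


if __name__ == "__main__":
    sample = [
        ("model-x", "model-y", "model_a"),
        ("model-y", "model-z", "tie"),
        ("model-x", "model-z", "model_b"),
    ]
    for name, rating in sorted(compute_elo(sample).items()):
        print(f"{name}: {rating:.1f}")
```

    Because the update is sequential, the resulting ratings depend on the order in which battles are processed; a small `k` reduces that sensitivity at the cost of slower convergence.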
mikhail-miguel

LMSYS Chatbot Arena Vision (Multimodal): Benchmarking LLMs and VLMs in the Wild - 1 view

  •  
    The Chatbot Arena has launched a beta feature that supports image input, letting users include an image in their conversations with chatbots. Each conversation can include one image, which must be under 15MB. The Chatbot Arena logs user requests, including submitted images, for research purposes. This data is not currently public, but it may be released in the future, so users should avoid sending confidential or personal information through this feature. Because the feature is at an early stage of development, there may be bugs; users are encouraged to report any issues through the Chatbot Arena communication channels.
Janos Haits

Chat with Open Large Language Models - 0 views

  •  
    "Chatbot Arena: Benchmarking LLMs in the Wild"