swarms

2 years ago · 5759a9f42f
parent eaccca8146
commit 5759a9f42f
2 changed files with 109 additions and 1 deletions
--- a/swarms/tools/README.md
+++ b/swarms/tools/README.md
@@ -57,4 +57,58 @@ class WriteFileTool(BaseTool):
            return f"Error: {e}"
 ```

-This tool takes the name of the file and the content to be written as parameters, writes the content to the file in the specified directory, and returns a success message. In case of any error, it returns the error message. You would follow a similar process to create the other tools.
+This tool takes the name of the file and the content to be written as parameters, writes the content to the file in the specified directory, and returns a success message. In case of any error, it returns the error message. You would follow a similar process to create the other tools.
+
+
+
+
+For browser-based tasks, you can use web automation tools. These tools let a worker interact with a browser the way a human user would. Here are 20 tools that individual worker swarm nodes can use:
+
+1. Open Browser Tool: Open a web browser.
+2. Close Browser Tool: Close the web browser.
+3. Navigate To URL Tool: Navigate to a specific URL.
+4. Fill Form Tool: Fill in a web form with provided data.
+5. Submit Form Tool: Submit a filled form.
+6. Click Button Tool: Click a button on a webpage.
+7. Hover Over Element Tool: Hover over a specific element on a webpage.
+8. Scroll Page Tool: Scroll up or down a webpage.
+9. Navigate Back Tool: Navigate back to the previous page.
+10. Navigate Forward Tool: Navigate forward to the next page.
+11. Refresh Page Tool: Refresh the current page.
+12. Switch Tab Tool: Switch between tabs in a browser.
+13. Capture Screenshot Tool: Capture a screenshot of the current page.
+14. Download File Tool: Download a file from a webpage.
+15. Send Email Tool: Send an email using a web-based email service.
+16. Login Tool: Log in to a website using provided credentials.
+17. Search Website Tool: Perform a search on a website.
+18. Extract Text Tool: Extract text from a webpage.
+19. Extract Image Tool: Extract image(s) from a webpage.
+20. Browser Session Management Tool: Handle creation, usage, and deletion of browser sessions.
+
+You would typically use a library like Selenium, Puppeteer, or Playwright to automate these tasks. Here's an example of how you might define the FillFormTool using Selenium in Python:
+
+```python
+from selenium import webdriver
+from selenium.webdriver.common.by import By
+from langchain.tools import BaseTool
+
+class FillFormTool(BaseTool):
+    name: str = "fill_form"
+    description: str = "Fill in a web form with provided data."
+
+    def _run(self, field_dict: dict) -> str:
+        """Fills a web form with the data in field_dict."""
+        try:
+            driver = webdriver.Firefox()
+            for field_name, field_value in field_dict.items():
+                element = driver.find_element(By.NAME, field_name)  # Selenium 4 locator API
+                element.send_keys(field_value)
+
+            return "Form filled successfully."
+        except Exception as e:
+            return f"Error: {e}"
+```
+
+In this tool, `field_dict` is a dictionary where the keys are the names of the form fields and the values are the data to be filled in each field. The tool finds each field in the form and fills it with the provided data.
+
+Note that in a real scenario you would need to manage the browser driver session more carefully (for example, reusing one driver across calls and calling `driver.quit()` when it is no longer needed), wait for the page to load before locating elements, and handle exceptions more thoroughly. This is a simplified example for illustrative purposes.
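The README note about driver lifecycle and page-load waits could be sketched roughly as follows. This is a minimal sketch and not part of the commit: `browser_session`, `fill_form`, and the `driver_factory` parameter are hypothetical names introduced for illustration. The Selenium imports are deferred into `fill_form` only so that the session helper can be exercised without a browser.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(driver_factory):
    """Create a driver, hand it to the caller, and always quit it afterwards.

    driver_factory is a hypothetical zero-argument callable, for example
    selenium.webdriver.Firefox.
    """
    driver = driver_factory()
    try:
        yield driver
    finally:
        # Runs on success and on error, so no browser process is leaked.
        driver.quit()


def fill_form(driver, field_dict, timeout=10):
    """Wait for each named field to appear, then type its value into it."""
    # Imported here so browser_session stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    for field_name, field_value in field_dict.items():
        # Explicit wait: raises TimeoutException if the field never shows up.
        element = wait.until(
            EC.presence_of_element_located((By.NAME, field_name))
        )
        element.send_keys(field_value)
```

With Selenium installed, one way to use this would be `with browser_session(webdriver.Firefox) as driver:`, then `driver.get(...)` followed by `fill_form(driver, {...})` inside the block. The driver is quit when the block exits, even if filling the form raises.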
--- a/swarms/tools/main.py
+++ b/swarms/tools/main.py
@@ -1320,3 +1320,57 @@ class VisualQuestionAnswering(BaseToolSet):
 #segment anything:

 ########################### MODELS
+
+
+#########==========================> 
+from selenium import webdriver
+from selenium.webdriver.common.by import By
+from langchain.tools import BaseTool
+
+class BrowserActionTool(BaseTool):
+    name: str = "browser_action"
+    description: str = "Perform a browser action."
+
+    def _run(self, action_type: str, action_details: dict) -> str:
+        """Perform a browser action based on action_type and action_details."""
+        try:
+            driver = webdriver.Firefox()
+
+            if action_type == 'Open Browser':
+                pass  # Browser is already opened
+            elif action_type == 'Close Browser':
+                driver.quit()
+            elif action_type == 'Navigate To URL':
+                driver.get(action_details['url'])
+            elif action_type == 'Fill Form':
+                for field_name, field_value in action_details['fields'].items():
+                    element = driver.find_element(By.NAME, field_name)
+                    element.send_keys(field_value)
+            elif action_type == 'Submit Form':
+                element = driver.find_element(By.NAME, action_details['form_name'])
+                element.submit()
+            elif action_type == 'Click Button':
+                element = driver.find_element(By.NAME, action_details['button_name'])
+                element.click()
+            elif action_type == 'Scroll Down':
+                driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
+            elif action_type == 'Scroll Up':
+                driver.execute_script("window.scrollTo(0, 0);")
+            elif action_type == 'Go Back':
+                driver.back()
+            elif action_type == 'Go Forward':
+                driver.forward()
+            elif action_type == 'Refresh':
+                driver.refresh()
+            elif action_type == 'Execute Javascript':
+                driver.execute_script(action_details['script'])
+            elif action_type == 'Switch Tab':
+                driver.switch_to.window(driver.window_handles[action_details['tab_index']])
+            elif action_type == 'Close Tab':
+                driver.close()
+            else:
+                return f"Error: Unknown action type {action_type}."
+
+            return f"Action {action_type} completed successfully."
+        except Exception as e:
+            return f"Error: {e}"
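Because `_run` above calls `webdriver.Firefox()` on every invocation, each action runs in a fresh browser, so a 'Navigate To URL' call followed by 'Go Back' would not share any state. One possible alternative, sketched here under assumptions, keeps a single driver and replaces the `if`/`elif` chain with a dispatch table. `BrowserActions` and its handler names are hypothetical and not part of this commit; only the driver methods (`get`, `back`, `forward`, `refresh`, `execute_script`) come from Selenium's WebDriver API, and only a subset of the actions is shown.

```python
class BrowserActions:
    """Dispatch action names to handlers that share one long-lived driver."""

    def __init__(self, driver):
        # driver is any Selenium WebDriver instance, e.g. webdriver.Firefox().
        self.driver = driver
        self._handlers = {
            "Navigate To URL": self._navigate,
            "Scroll Down": self._scroll_down,
            "Scroll Up": self._scroll_up,
            "Go Back": lambda details: self.driver.back(),
            "Go Forward": lambda details: self.driver.forward(),
            "Refresh": lambda details: self.driver.refresh(),
        }

    def _navigate(self, details):
        self.driver.get(details["url"])

    def _scroll_down(self, details):
        self.driver.execute_script(
            "window.scrollTo(0, document.body.scrollHeight);"
        )

    def _scroll_up(self, details):
        self.driver.execute_script("window.scrollTo(0, 0);")

    def run(self, action_type, action_details=None):
        """Run one action and return a status string, mirroring the tool above."""
        handler = self._handlers.get(action_type)
        if handler is None:
            return f"Error: Unknown action type {action_type}."
        try:
            handler(action_details or {})
            return f"Action {action_type} completed successfully."
        except Exception as e:
            return f"Error: {e}"
```

A tool's `_run` could then delegate to a shared `BrowserActions` instance, so that consecutive actions operate on the same browser window and the driver is quit once, when the session ends.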