diff --git a/ivy_models/efficientnet/README.rst b/ivy_models/efficientnet/README.rst
new file mode 100644
index 00000000..7e21f51c
--- /dev/null
+++ b/ivy_models/efficientnet/README.rst
@@ -0,0 +1,91 @@
+.. image:: https://github.com/unifyai/unifyai.github.io/blob/main/img/externally_linked/logo.png?raw=true#gh-light-mode-only
+   :width: 100%
+   :class: only-light
+
+.. image:: https://github.com/unifyai/unifyai.github.io/blob/main/img/externally_linked/logo_dark.png?raw=true#gh-dark-mode-only
+   :width: 100%
+   :class: only-dark
+
+
+EfficientNet
+============
+
+`EfficientNet `_ is a convolutional neural network architecture and scaling method that uniformly scales network depth,
+width, and resolution using a compound coefficient. Unlike conventional approaches that scale these dimensions
+arbitrarily and independently, EfficientNet scales all three in a balanced way with a fixed set of scaling coefficients.
+
+The intuition behind compound scaling is straightforward: as the input image size increases,
+the network needs additional layers to enlarge its receptive field, along with more channels to capture
+the finer-grained patterns present in larger images.
+
+Getting started
+-----------------
+
+.. code-block:: python
+
+    import ivy
+    from ivy_models.efficientnet import efficientnet_b0
+    ivy.set_backend("torch")
+
+    # Instantiate the efficientnet_b0 model
+    ivy_efficientnet_b0 = efficientnet_b0(pretrained=True)
+
+    # Convert the Torch image tensor (NCHW) to a channels-last Ivy tensor
+    # (`torch_img` is assumed to be a preprocessed image batch, e.g. of shape (1, 3, 224, 224))
+    img = ivy.asarray(torch_img.permute((0, 2, 3, 1)), dtype="float32", device="gpu:0")
+
+    # Compile the Ivy efficientnet_b0 model with the Ivy image tensor
+    ivy_efficientnet_b0.compile(args=(img,))
+
+    # Pass the Ivy image tensor through the model and apply softmax
+    output = ivy.softmax(ivy_efficientnet_b0(img))
+
+    # Get the indices of the top 3 classes from the output probabilities
+    classes = ivy.argsort(output[0], descending=True)[:3]
+
+    # Retrieve the probabilities corresponding to the top 3 classes
+    probs = ivy.gather(output[0], classes)
+
+    # `categories` is assumed to be the list of ImageNet class names
+    print("Indices of the top 3 classes are:", classes)
+    print("Probabilities of the top 3 classes are:", probs)
+    print("Categories of the top 3 classes are:", [categories[i] for i in classes.to_list()])
+
+    `Indices of the top 3 classes are: ivy.array([282, 281, 285], dev=gpu:0)`
+    `Probabilities of the top 3 classes are: ivy.array([0.60317987, 0.18620452, 0.07535177], dev=gpu:0)`
+    `Categories of the top 3 classes are: ['tiger cat', 'tabby', 'Egyptian cat']`
+
+Citation
+--------
+
+::
+
+    @article{tan2020efficientnet,
+        title={EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks},
+        author={Mingxing Tan and Quoc V. Le},
+        journal={arXiv preprint arXiv:1905.11946},
+        year={2020}
+    }
+
+
+    @article{lenton2021ivy,
+        title={Ivy: Templated deep learning for inter-framework portability},
+        author={Lenton, Daniel and Pardo, Fabio and Falck, Fabian and James, Stephen and Clark, Ronald},
+        journal={arXiv preprint arXiv:2102.02886},
+        year={2021}
+    }
diff --git a/ivy_models/resnet/README.rst b/ivy_models/resnet/README.rst
new file mode 100644
index 00000000..1aaf616e
--- /dev/null
+++ b/ivy_models/resnet/README.rst
@@ -0,0 +1,96 @@
+.. image:: https://github.com/unifyai/unifyai.github.io/blob/main/img/externally_linked/logo.png?raw=true#gh-light-mode-only
+   :width: 100%
+   :class: only-light
+
+.. image:: https://github.com/unifyai/unifyai.github.io/blob/main/img/externally_linked/logo_dark.png?raw=true#gh-dark-mode-only
+   :width: 100%
+   :class: only-dark
+
+
+ResNet
+===========
+
+`ResNet `_, also known as Residual Network, is a deep learning model used in computer vision tasks.
+It is a Convolutional Neural Network (CNN) architecture designed to support a very large number of convolutional layers,
+potentially ranging from hundreds to thousands. ResNet addresses the vanishing gradient problem encountered during training
+by introducing "skip connections." These connections enable the network to bypass several initial layers,
+which consist of identity mappings (convolutional layers with no immediate impact).
+This speeds up initial training by effectively condensing the network into fewer layers.
+
+When the network is subsequently trained to its full depth, the previously bypassed sections,
+called "residual parts," get the opportunity to explore the features of the input image more comprehensively.
+
+Getting started
+-----------------
+
+.. code-block:: python
+
+    import ivy
+    from ivy_models.resnet import resnet_34
+    ivy.set_backend("torch")
+
+    # Instantiate the resnet_34 model
+    ivy_resnet_34 = resnet_34(pretrained=True)
+
+    # Convert the Torch image tensor (NCHW) to a channels-last Ivy tensor
+    # (`torch_img` is assumed to be a preprocessed image batch, e.g. of shape (1, 3, 224, 224))
+    img = ivy.asarray(torch_img.permute((0, 2, 3, 1)), dtype="float32", device="gpu:0")
+
+    # Compile the Ivy resnet_34 model with the Ivy image tensor
+    ivy_resnet_34.compile(args=(img,))
+
+    # Pass the Ivy image tensor through the model and apply softmax
+    output = ivy.softmax(ivy_resnet_34(img))
+
+    # Get the indices of the top 3 classes from the output probabilities
+    classes = ivy.argsort(output[0], descending=True)[:3]
+
+    # Retrieve the probabilities corresponding to the top 3 classes
+    probs = ivy.gather(output[0], classes)
+
+    # `categories` is assumed to be the list of ImageNet class names
+    print("Indices of the top 3 classes are:", classes)
+    print("Probabilities of the top 3 classes are:", probs)
+    print("Categories of the top 3 classes are:", [categories[i] for i in classes.to_list()])
+
+    `Indices of the top 3 classes are: ivy.array([282, 281, 285], dev=gpu:0)`
+    `Probabilities of the top 3 classes are: ivy.array([0.85072654, 0.13506058, 0.00688287], dev=gpu:0)`
+    `Categories of the top 3 classes are: ['tiger cat', 'tabby', 'Egyptian cat']`
+
+See `this demo `_ for more usage examples.
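The skip connections described above can be sketched in a few lines: a residual block computes ``F(x) + x``, so a block whose transform has not yet learned anything useful defaults to the identity mapping. The toy example below (plain Python; the names ``residual_block`` and ``toy_layer`` are illustrative and not part of Ivy's ResNet implementation) shows the idea under that assumption:

```python
# Illustrative sketch of a residual (skip) connection: output = F(x) + x.
# When F(x) is near zero, the block reduces to the identity mapping,
# which is what makes very deep stacks of such blocks easy to optimize.

def residual_block(x, transform):
    """Apply the residual function F (`transform`) and add the skip connection."""
    return [f + xi for f, xi in zip(transform(x), x)]

def toy_layer(scale):
    """A stand-in for conv/batchnorm layers: scales every element by `scale`."""
    return lambda x: [scale * xi for xi in x]

x = [1.0, 2.0, 3.0]

# A zero transform leaves the input unchanged -- the identity mapping.
assert residual_block(x, toy_layer(0.0)) == x

# A non-zero transform yields F(x) + x.
print(residual_block(x, toy_layer(0.5)))  # [1.5, 3.0, 4.5]
```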
+
+Citation
+--------
+
+::
+
+    @article{he2015deep,
+        title={Deep Residual Learning for Image Recognition},
+        author={Kaiming He and Xiangyu Zhang and Shaoqing Ren and Jian Sun},
+        journal={arXiv preprint arXiv:1512.03385},
+        year={2015}
+    }
+
+
+    @article{lenton2021ivy,
+        title={Ivy: Templated deep learning for inter-framework portability},
+        author={Lenton, Daniel and Pardo, Fabio and Falck, Fabian and James, Stephen and Clark, Ronald},
+        journal={arXiv preprint arXiv:2102.02886},
+        year={2021}
+    }
diff --git a/ivy_models/unet/README.rst b/ivy_models/unet/README.rst
index 9cf4d558..27e42843 100644
--- a/ivy_models/unet/README.rst
+++ b/ivy_models/unet/README.rst
@@ -27,7 +27,7 @@
 U-Net
 ===========
-`Unet `_ The UNET architecture and training approach effectively leverage data augmentation to make the most of
+The `U-Net `_ architecture and training approach effectively leverage data augmentation to make the most of
 available annotated samples, even with limited data. The design features a contracting path for context capture and
 a symmetric expanding path for precise localization. Notably, this UNET network achieves superior performance with minimal
 images during end-to-end training. It surpasses previous methods, including a sliding-window convolutional network,
 in the ISBI challenge for segmenting neuronal structures in